Take-home Exercise 3: Provincial Competitiveness Index influence on FDI in Vietnam

Author

kai feng

Published

October 15, 2024

Modified

November 12, 2024

Introduction

Provincial Competitiveness Index in Vietnam

Context: Vietnam’s provinces vary significantly in competitiveness, as captured by the Provincial Competitiveness Index (PCI). This index evaluates key dimensions such as entry costs, land access, transparency, and labor policies, which influence the investment climate and economic potential of each region.

Challenges: Provinces aiming to attract investment face challenges related to regional disparities and governance effectiveness. Understanding PCI dimensions is essential for identifying strengths and areas for improvement.

Analysis Focus

Objectives: This analysis aims to evaluate PCI dimensions through linear regression, examining their correlation with FDI projects and FDI registered capital inflow to identify combinations that drive investment.

Goals:

  • Identify Key Factors: Determine which PCI dimensions most influence FDI total projects and FDI total registered capital.

  • Province-Specific Insights: Highlight PCI factors lacking in specific provinces to guide policymaking.

  • Actionable Recommendations: Provide targeted suggestions for enhancing PCI dimensions to improve the investment climate.

Significance

This project will analyze how PCI dimensions affect Vietnam’s economic landscape, offering actionable insights to help policymakers enhance regional competitiveness and stimulate sustainable development.



1.0 Setup

1.1 Installing R-Packages

  • sf:

    • For handling spatial vector data and transforming it into simple features (sf) objects.

    • Functions like st_read() for importing spatial data and st_transform() for coordinate reference system transformations.

  • tidyverse: For data manipulation and transformation, including functions for working with tibble data frames.

  • readr: For reading in CSV or other text-based data files.

  • openxlsx, readxl: For reading and writing XLSX files.

  • dplyr: For data manipulation (e.g. grouping and summarizing the relationships between columns).

  • knitr, gtsummary: For styling tables
  • tmap: For creating thematic maps

  • animation, png, magick: For animation work

  • sfdep: For performing both local and global spatial autocorrelation analysis
  • ggstatsplot: For visualizing relationships with statistical details
  • olsrr: For building OLS regression models and running diagnostic tests
  • performance: For visually comparing models
  • GWmodel: For geographically weighted models
pacman::p_load(tidyverse, sf, readr, tmap, dplyr, knitr, animation, png, magick, openxlsx, readxl, sfdep, ggstatsplot, olsrr, performance, gtsummary, GWmodel)


1.2 Data Acquisition

We will be using these datasets:

  • Source: Vietnam Statistics Office, Provincial Competitiveness Index

  • Provincial Competitiveness Index (PCI): To evaluate the competitive environment of each province, identifying strengths and weaknesses that influence investment potential.

  • Foreign Direct Investment (FDI): To assess the attractiveness of provinces for foreign investors and identify trends in investment across different sectors.


1.3 Data Preparation and Wrangling

provincial_boundaries <- st_read(dsn = "data/boundaries/provincial", layer="geoBoundaries-VNM-ADM1")
class(provincial_boundaries)
st_crs(provincial_boundaries)

provincial_boundaries <- provincial_boundaries %>%
  st_transform(crs = 3405) # Transform coordinate

# Drop & Rename column
provincial_boundaries <- provincial_boundaries %>% 
  select(shapeName, shapeISO, shapeGroup, geometry) %>% 
  rename(
    province_vn = shapeName,
    province_code = shapeISO,
    country_code = shapeGroup
  )

# Create a new column 'province_en' based on 'province_code'
provincial_boundaries <- provincial_boundaries %>%
  mutate(province_en = case_when(
    province_code == "VN-44" ~ "An Giang",
    province_code == "VN-43" ~ "BRVT",
    province_code == "VN-54" ~ "Bac Giang",
    province_code == "VN-53" ~ "Bac Kan",
    province_code == "VN-55" ~ "Bac Lieu",
    province_code == "VN-56" ~ "Bac Ninh",
    province_code == "VN-50" ~ "Ben Tre",
    province_code == "VN-31" ~ "Binh Dinh",
    province_code == "VN-57" ~ "Binh Duong",
    province_code == "VN-58" ~ "Binh Phuoc",
    province_code == "VN-40" ~ "Binh Thuan",
    province_code == "VN-59" ~ "Ca Mau",
    province_code == "VN-CT" ~ "Can Tho",
    province_code == "VN-04" ~ "Cao Bang",
    province_code == "VN-DN" ~ "Da Nang",
    province_code == "VN-33" ~ "Dak Lak",
    province_code == "VN-72" ~ "Dak Nong",
    province_code == "VN-71" ~ "Dien Bien",
    province_code == "VN-39" ~ "Dong Nai",
    province_code == "VN-45" ~ "Dong Thap",
    province_code == "VN-30" ~ "Gia Lai",
    province_code == "VN-SG" ~ "HCMC",
    province_code == "VN-03" ~ "Ha Giang",
    province_code == "VN-63" ~ "Ha Nam",
    province_code == "VN-HN" ~ "Ha Noi",
    province_code == "VN-23" ~ "Ha Tinh",
    province_code == "VN-61" ~ "Hai Duong",
    province_code == "VN-HP" ~ "Hai Phong",
    province_code == "VN-73" ~ "Hau Giang",
    province_code == "VN-14" ~ "Hoa Binh",
    province_code == "VN-66" ~ "Hung Yen",
    province_code == "VN-34" ~ "Khanh Hoa",
    province_code == "VN-47" ~ "Kien Giang",
    province_code == "VN-28" ~ "Kon Tum",
    province_code == "VN-01" ~ "Lai Chau",
    province_code == "VN-35" ~ "Lam Dong",
    province_code == "VN-09" ~ "Lang Son",
    province_code == "VN-02" ~ "Lao Cai",
    province_code == "VN-41" ~ "Long An",
    province_code == "VN-67" ~ "Nam Dinh",
    province_code == "VN-22" ~ "Nghe An",
    province_code == "VN-18" ~ "Ninh Binh",
    province_code == "VN-36" ~ "Ninh Thuan",
    province_code == "VN-68" ~ "Phu Tho",
    province_code == "VN-32" ~ "Phu Yen",
    province_code == "VN-24" ~ "Quang Binh",
    province_code == "VN-27" ~ "Quang Nam",
    province_code == "VN-29" ~ "Quang Ngai",
    province_code == "VN-13" ~ "Quang Ninh",
    province_code == "VN-25" ~ "Quang Tri",
    province_code == "VN-52" ~ "Soc Trang",
    province_code == "VN-05" ~ "Son La",
    province_code == "VN-26" ~ "TT-Hue",
    province_code == "VN-37" ~ "Tay Ninh",
    province_code == "VN-20" ~ "Thai Binh",
    province_code == "VN-69" ~ "Thai Nguyen",
    province_code == "VN-21" ~ "Thanh Hoa",
    province_code == "VN-46" ~ "Tien Giang",
    province_code == "VN-51" ~ "Tra Vinh",
    province_code == "VN-07" ~ "Tuyen Quang",
    province_code == "VN-49" ~ "Vinh Long",
    province_code == "VN-70" ~ "Vinh Phuc",
    province_code == "VN-06" ~ "Yen Bai"
  )) %>% 
  select (province_en, everything())

write_rds(provincial_boundaries, "data/rds/provincial_boundaries.rds")
Note

Since the coordinate reference system of provincial_boundaries is EPSG:4326 (unit of measurement = degree), we transform it to the projected CRS EPSG:3405 used in the code above.

We also need an English name for each province so that the province boundaries can be matched with the other datasets.
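
The code-to-name mapping above could also be kept in a single lookup table and reused in both directions (code to name for the boundaries, name to code for the PCI data). The following is a minimal base R sketch of that idea, not the approach used in this exercise; the three-row table and the object names province_lookup, code_to_name and name_to_code are illustrative assumptions.

```r
# Hypothetical lookup table with three of the provinces from the mapping above
province_lookup <- data.frame(
  province_code = c("VN-44", "VN-43", "VN-SG"),
  province_en   = c("An Giang", "BRVT", "HCMC"),
  stringsAsFactors = FALSE
)

# Named vectors act as dictionaries in both directions
code_to_name <- setNames(province_lookup$province_en, province_lookup$province_code)
name_to_code <- setNames(province_lookup$province_code, province_lookup$province_en)

# Codes that are missing from the table come back as NA, which makes gaps easy to spot
codes <- c("VN-SG", "VN-44", "VN-99")
names_en <- unname(code_to_name[codes])
print(names_en)
```

Keeping one table means a spelling change only has to be made in one place.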

pci_2021 <- read_xlsx("data/provincial_competitiveness_index/2021.xlsx")

pci_2021 <- pci_2021 %>%
  mutate(
    province_code = case_when(
      province_en == "An Giang" ~ "VN-44",
      province_en == "BRVT" ~ "VN-43",
      province_en == "Bac Giang" ~ "VN-54",
      province_en == "Bac Kan" ~ "VN-53",
      province_en == "Bac Lieu" ~ "VN-55",
      province_en == "Bac Ninh" ~ "VN-56",
      province_en == "Ben Tre" ~ "VN-50",
      province_en == "Binh Dinh" ~ "VN-31",
      province_en == "Binh Duong" ~ "VN-57",
      province_en == "Binh Phuoc" ~ "VN-58",
      province_en == "Binh Thuan" ~ "VN-40",
      province_en == "Ca Mau" ~ "VN-59",
      province_en == "Can Tho" ~ "VN-CT",
      province_en == "Cao Bang" ~ "VN-04",
      province_en == "Da Nang" ~ "VN-DN",
      province_en == "Dak Lak" ~ "VN-33",
      province_en == "Dak Nong" ~ "VN-72",
      province_en == "Dien Bien" ~ "VN-71",
      province_en == "Dong Nai" ~ "VN-39",
      province_en == "Dong Thap" ~ "VN-45",
      province_en == "Gia Lai" ~ "VN-30",
      province_en == "HCMC" ~ "VN-SG",
      province_en == "Ha Giang" ~ "VN-03",
      province_en == "Ha Nam" ~ "VN-63",
      province_en == "Ha Noi" ~ "VN-HN",
      province_en == "Ha Tinh" ~ "VN-23",
      province_en == "Hai Duong" ~ "VN-61",
      province_en == "Hai Phong" ~ "VN-HP",
      province_en == "Hau Giang" ~ "VN-73",
      province_en == "Hoa Binh" ~ "VN-14",
      province_en == "Hung Yen" ~ "VN-66",
      province_en == "Khanh Hoa" ~ "VN-34",
      province_en == "Kien Giang" ~ "VN-47",
      province_en == "Kon Tum" ~ "VN-28",
      province_en == "Lai Chau" ~ "VN-01",
      province_en == "Lam Dong" ~ "VN-35",
      province_en == "Lang Son" ~ "VN-09",
      province_en == "Lao Cai" ~ "VN-02",
      province_en == "Long An" ~ "VN-41",
      province_en == "Nam Dinh" ~ "VN-67",
      province_en == "Nghe An" ~ "VN-22",
      province_en == "Ninh Binh" ~ "VN-18",
      province_en == "Ninh Thuan" ~ "VN-36",
      province_en == "Phu Tho" ~ "VN-68",
      province_en == "Phu Yen" ~ "VN-32",
      province_en == "Quang Binh" ~ "VN-24",
      province_en == "Quang Nam" ~ "VN-27",
      province_en == "Quang Ngai" ~ "VN-29",
      province_en == "Quang Ninh" ~ "VN-13",
      province_en == "Quang Tri" ~ "VN-25",
      province_en == "Soc Trang" ~ "VN-52",
      province_en == "Son La" ~ "VN-05",
      province_en == "TT-Hue" ~ "VN-26",
      province_en == "Tay Ninh" ~ "VN-37",
      province_en == "Thai Binh" ~ "VN-20",
      province_en == "Thai Nguyen" ~ "VN-69",
      province_en == "Thanh Hoa" ~ "VN-21",
      province_en == "Tien Giang" ~ "VN-46",
      province_en == "Tra Vinh" ~ "VN-51",
      province_en == "Tuyen Quang" ~ "VN-07",
      province_en == "Vinh Long" ~ "VN-49",
      province_en == "Vinh Phuc" ~ "VN-70",
      province_en == "Yen Bai" ~ "VN-06",
    )
  ) %>%
  select(province_en, province_code, everything())

write.xlsx(pci_2021, "data/rds/pci_2021.xlsx")




fdi <- read_xlsx("data/fdi.xlsx")
# Rename columns
colnames(fdi) <- c("province_en", "total_project_count", 
                   "total_registered_capital")
# Remove the first two rows
fdi <- fdi[-c(1, 2), ]

fdi <- fdi %>%
  mutate(
    province_code = case_when(
      province_en == "An Giang" ~ "VN-44",
      province_en == "Ba Ria - Vung Tau" ~ "VN-43",
      province_en == "Bac Giang" ~ "VN-54",
      province_en == "Bac Kan" ~ "VN-53",
      province_en == "Bac Lieu" ~ "VN-55",
      province_en == "Bac Ninh" ~ "VN-56",
      province_en == "Ben Tre" ~ "VN-50",
      province_en == "Binh Dinh" ~ "VN-31",
      province_en == "Binh  Duong" ~ "VN-57",
      province_en == "Binh Phuoc" ~ "VN-58",
      province_en == "Binh Thuan" ~ "VN-40",
      province_en == "Ca Mau" ~ "VN-59",
      province_en == "Can Tho" ~ "VN-CT",
      province_en == "Cao Bang" ~ "VN-04",
      province_en == "Da Nang" ~ "VN-DN",
      province_en == "Dak Lak" ~ "VN-33",
      province_en == "Dak Nong" ~ "VN-72",
      province_en == "Dien Bien" ~ "VN-71",
      province_en == "Dong Nai" ~ "VN-39",
      province_en == "Dong Thap" ~ "VN-45",
      province_en == "Gia Lai" ~ "VN-30",
      province_en == "Ho Chi Minh city" ~ "VN-SG",
      province_en == "Ha Giang" ~ "VN-03",
      province_en == "Ha Nam" ~ "VN-63",
      province_en == "Ha Noi" ~ "VN-HN",
      province_en == "Ha Tinh" ~ "VN-23",
      province_en == "Hai Duong" ~ "VN-61",
      province_en == "Hai Phong" ~ "VN-HP",
      province_en == "Hau Giang" ~ "VN-73",
      province_en == "Hoa Binh" ~ "VN-14",
      province_en == "Hung Yen" ~ "VN-66",
      province_en == "Khanh  Hoa" ~ "VN-34",
      province_en == "Kien  Giang" ~ "VN-47",
      province_en == "Kon Tum" ~ "VN-28",
      province_en == "Lai Chau" ~ "VN-01",
      province_en == "Lam Dong" ~ "VN-35",
      province_en == "Lang Son" ~ "VN-09",
      province_en == "Lao Cai" ~ "VN-02",
      province_en == "Long An" ~ "VN-41",
      province_en == "Nam Dinh" ~ "VN-67",
      province_en == "Nghe An" ~ "VN-22",
      province_en == "Ninh Binh" ~ "VN-18",
      province_en == "Ninh  Thuan" ~ "VN-36",
      province_en == "Phu Tho" ~ "VN-68",
      province_en == "Phu Yen" ~ "VN-32",
      province_en == "Quang Binh" ~ "VN-24",
      province_en == "Quang  Nam" ~ "VN-27",
      province_en == "Quang  Ngai" ~ "VN-29",
      province_en == "Quang Ninh" ~ "VN-13",
      province_en == "Quang Tri" ~ "VN-25",
      province_en == "Soc Trang" ~ "VN-52",
      province_en == "Son La" ~ "VN-05",
      province_en == "Thua Thien-Hue" ~ "VN-26",
      province_en == "Tay Ninh" ~ "VN-37",
      province_en == "Thai Binh" ~ "VN-20",
      province_en == "Thai  Nguyen" ~ "VN-69",
      province_en == "Thanh Hoa" ~ "VN-21",
      province_en == "Tien Giang" ~ "VN-46",
      province_en == "Tra Vinh" ~ "VN-51",
      province_en == "Tuyen Quang" ~ "VN-07",
      province_en == "Vinh Long" ~ "VN-49",
      province_en == "Vinh Phuc" ~ "VN-70",
      province_en == "Yen Bai" ~ "VN-06",
    )
  ) %>%
  select(province_en, province_code, everything())

fdi <- fdi %>% 
  left_join(provincial_boundaries, by = "province_code") %>% 
  select(province_en.x, province_code, total_project_count, total_registered_capital, geometry) %>% 
  rename(province_en = province_en.x)

write_rds(fdi, "data/rds/fdi.rds")
Note

The PCI 2021 workbook was inconsistently formatted, so I created a new sheet called ‘summary’ and renamed the original to ‘summary - old’. The new sheet uses Excel’s XLOOKUP function to populate the data from the old sheet, which was faster than handling it in R, where separate code would be needed for each data type.

For the economy_pie dataset, similar simple reformatting was done, with the ‘Summary’ sheet derived from the ‘Summary -old’ sheet.
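
Several province names in the FDI mapping above contain doubled spaces (for example "Binh  Duong"), so the match depends on reproducing the raw spelling exactly. A possible alternative, sketched here in base R under the assumption that irregular whitespace is the only difference, is to normalize the names before matching and then list anything that still fails to match. The helper clean_name and the sample vectors are hypothetical.

```r
# Collapse runs of whitespace to a single space and trim both ends
clean_name <- function(x) gsub("\\s+", " ", trimws(x))

# Hypothetical raw names, similar to those seen in the FDI sheet
raw_names   <- c("Binh  Duong", " Khanh  Hoa", "Ha Noi", "Unknown Province")
known_names <- c("Binh Duong", "Khanh Hoa", "Ha Noi")

cleaned   <- clean_name(raw_names)
unmatched <- setdiff(cleaned, known_names)

print(cleaned)
print(unmatched)  # names that would end up with a missing province_code
```

Running a check like this before the join helps catch provinces that would otherwise silently receive NA.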



2.0 Importing the clean set of data

provincial_boundaries <- read_rds("data/rds/provincial_boundaries.rds")

pci_2021 <- read_xlsx("data/rds/pci_2021.xlsx")

fdi <- read_rds("data/rds/fdi.rds")



3.0 Prioritization Analysis for Provincial Development: Identifying Key Predictors

3.1 Correlation Matrix

The PCI consists of nine dimensions, each serving as an independent variable with varying degrees of influence on FDI data.

Given that some dimensions may exhibit high correlation with one another, it is essential to identify these correlated pairs and select only one variable from each pair for analysis.

To achieve this, we compute a correlation matrix to assess the relationships between the dimensions.

ggcorrmat(pci_2021[,4:13])

Note

Interpretation

A pair of variables with an absolute correlation coefficient above 0.8 is treated as highly correlated.

No pair exceeds this threshold. We will reconfirm this later in [4.6 Checking for multicollinearity].
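
The 0.8 rule can also be applied programmatically rather than by reading the plot. The sketch below uses base R's cor() on a small synthetic data frame (not the PCI data); the function high_cor_pairs and the column names a, b and c are assumptions made for illustration.

```r
set.seed(1)
n <- 63
toy <- data.frame(a = rnorm(n))
toy$b <- toy$a + rnorm(n, sd = 0.1)  # built to correlate strongly with a
toy$c <- rnorm(n)                    # independent of a and b

# Return every variable pair whose absolute correlation exceeds the threshold
high_cor_pairs <- function(df, threshold = 0.8) {
  cm  <- cor(df, use = "pairwise.complete.obs")
  # upper.tri() keeps each pair once and drops the diagonal
  idx <- which(abs(cm) > threshold & upper.tri(cm), arr.ind = TRUE)
  data.frame(
    var1 = rownames(cm)[idx[, "row"]],
    var2 = colnames(cm)[idx[, "col"]],
    r    = cm[idx],
    stringsAsFactors = FALSE
  )
}

pairs_found <- high_cor_pairs(toy)
print(pairs_found)
```

An empty result on the real PCI columns would correspond to the conclusion stated above.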


3.2 Conduct Linear Regression

To explore the influence of each PCI dimension on FDI, we begin with a linear regression model. This initial model will help us determine the relationship between each independent variable (PCI dimensions) and FDI.

By examining the direction and size of each coefficient, we can start to understand the general influence of each dimension. This setup provides a foundation for refining our analysis and identifying key predictors in subsequent steps.

pci_2021 <- pci_2021 %>% 
  left_join(fdi %>% 
              select(province_code, total_project_count, total_registered_capital),
            by = "province_code")

pci_2021$total_registered_capital <- as.numeric(as.character(pci_2021$total_registered_capital))

pci_2021$total_project_count <- as.numeric(as.character(pci_2021$total_project_count))
pci_project_mlr <- lm(formula = total_project_count ~ `Entry Costs` + 
                  `Land Access` + Transparency + 
                  `Time Costs` + `Informal charges` + Proactivity + 
                  `Business Support Policy` + `Labor Policy` +
                `Law & Order`,
                data=pci_2021)

ols_regress(pci_project_mlr)
                            Model Summary                             
---------------------------------------------------------------------
R                         0.645       RMSE                  1302.611 
R-Squared                 0.416       MSE                2011017.608 
Adj. R-Squared            0.319       Coef. Var              246.439 
Pred R-Squared            0.125       AIC                   1121.656 
MAE                     825.133       SBC                   1145.404 
---------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                 ANOVA                                   
------------------------------------------------------------------------
                     Sum of                                             
                    Squares        DF    Mean Square      F        Sig. 
------------------------------------------------------------------------
Regression     77389878.935         9    8598875.437    4.276     3e-04 
Residual      108594950.815        54    2011017.608                    
Total         185984829.750        63                                   
------------------------------------------------------------------------

                                             Parameter Estimates                                              
-------------------------------------------------------------------------------------------------------------
                    model        Beta    Std. Error    Std. Beta      t        Sig         lower       upper 
-------------------------------------------------------------------------------------------------------------
              (Intercept)     985.260      4334.255                  0.227    0.821    -7704.398    9674.917 
            `Entry Costs`    -578.238       362.995       -0.181    -1.593    0.117    -1305.998     149.523 
            `Land Access`     219.114       451.280        0.061     0.486    0.629     -685.648    1123.877 
             Transparency    -335.325       326.078       -0.123    -1.028    0.308     -989.073     318.422 
             `Time Costs`     459.403       340.015        0.204     1.351    0.182     -222.286    1141.091 
       `Informal charges`    -453.756       386.112       -0.184    -1.175    0.245    -1227.864     320.351 
              Proactivity    -189.001       393.744       -0.064    -0.480    0.633     -978.409     600.407 
`Business Support Policy`     681.895       246.477        0.310     2.767    0.008      187.738    1176.052 
           `Labor Policy`     846.539       279.910        0.361     3.024    0.004      285.354    1407.724 
            `Law & Order`    -632.826       443.014       -0.210    -1.428    0.159    -1521.015     255.363 
-------------------------------------------------------------------------------------------------------------
tbl_regression(pci_project_mlr, 
               intercept = TRUE) %>% 
  add_glance_source_note(
    label = list(sigma ~ "\U03C3"),
    include = c(r.squared, adj.r.squared, 
                AIC, statistic,
                p.value, sigma))
Characteristic             Beta    95% CI           p-value
(Intercept)                985     -7,704, 9,675    0.8
Entry Costs                -578    -1,306, 150      0.12
Land Access                219     -686, 1,124      0.6
Transparency               -335    -989, 318        0.3
Time Costs                 459     -222, 1,141      0.2
Informal charges           -454    -1,228, 320      0.2
Proactivity                -189    -978, 600        0.6
Business Support Policy    682     188, 1,176       0.008
Labor Policy               847     285, 1,408       0.004
Law & Order                -633    -1,521, 255      0.2
R² = 0.416; Adjusted R² = 0.319; AIC = 1,122; Statistic = 4.28; p-value = <0.001; σ = 1,418
CI = Confidence Interval
Note

Model Summary

Adjusted R-Squared of 0.319, indicating that approximately 31.9% of the variation in FDI total number of projects can be accounted for by the independent variables, adjusting for the number of predictors in the model.

ANOVA - Analysis of Variance (F test)

The F-ratio of 4.276 is significant at p < 0.001. Hence, our regression model is statistically significant, suggesting that at least some of the PCI dimensions meaningfully contribute to predicting FDI total number of projects.

The model summary and ANOVA results reveal that while the overall model has a moderate level of predictive power, with some independent variables (such as Business Support Policy and Labor Policy) showing significant contributions, others (like Entry Costs and Transparency) did not demonstrate strong effects.
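
As a quick arithmetic check, the adjusted R-squared and F-ratio quoted above can be recomputed from the sums of squares and degrees of freedom in the printed ANOVA table. This is a sketch using values copied from that output; the variable names are illustrative only.

```r
# Values copied from the ANOVA table for the project-count model
ss_regression <- 77389878.935
ss_residual   <- 108594950.815
ss_total      <- 185984829.750
df_regression <- 9
df_residual   <- 54
df_total      <- 63

r2     <- ss_regression / ss_total
# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (ss_residual / df_residual) / (ss_total / df_total)
# F-ratio is the regression mean square over the residual mean square
f_stat <- (ss_regression / df_regression) / (ss_residual / df_residual)
# Upper-tail probability of the F distribution gives the model p-value
p_val  <- pf(f_stat, df_regression, df_residual, lower.tail = FALSE)

print(round(c(r2 = r2, adj_r2 = adj_r2, f = f_stat), 3))
print(p_val)
```

The rounded results agree with the R-Squared, Adj. R-Squared and F values in the model summary.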

pci_capital_mlr <- lm(formula = total_registered_capital ~ `Entry Costs` + 
                  `Land Access` + Transparency + 
                  `Time Costs` + `Informal charges` + Proactivity + 
                  `Business Support Policy` + `Labor Policy` +
                `Law & Order`,
                data=pci_2021)

ols_regress(pci_capital_mlr)
                             Model Summary                              
-----------------------------------------------------------------------
R                          0.739       RMSE                   7912.959 
R-Squared                  0.546       MSE                74210280.110 
Adj. R-Squared             0.470       Coef. Var               117.038 
Pred R-Squared             0.373       AIC                    1352.585 
MAE                     6437.884       SBC                    1376.333 
-----------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                   ANOVA                                    
---------------------------------------------------------------------------
                      Sum of                                               
                     Squares        DF      Mean Square      F        Sig. 
---------------------------------------------------------------------------
Regression    4821217884.446         9    535690876.050    7.219    0.0000 
Residual      4007355125.955        54     74210280.110                    
Total         8828573010.401        63                                     
---------------------------------------------------------------------------

                                               Parameter Estimates                                                
-----------------------------------------------------------------------------------------------------------------
                    model          Beta    Std. Error    Std. Beta      t        Sig          lower        upper 
-----------------------------------------------------------------------------------------------------------------
              (Intercept)    -32220.037     26329.252                 -1.224    0.226    -85007.009    20566.936 
            `Entry Costs`     -3738.724      2205.079       -0.170    -1.696    0.096     -8159.642      682.194 
            `Land Access`      2393.728      2741.389        0.097     0.873    0.386     -3102.425     7889.882 
             Transparency       252.004      1980.823        0.013     0.127    0.899     -3719.308     4223.316 
             `Time Costs`      5067.961      2065.484        0.326     2.454    0.017       926.914     9209.008 
       `Informal charges`     -3828.241      2345.509       -0.226    -1.632    0.108     -8530.704      874.222 
              Proactivity     -2310.748      2391.870       -0.113    -0.966    0.338     -7106.159     2484.663 
`Business Support Policy`      5247.362      1497.272        0.347     3.505    0.001      2245.512     8249.211 
           `Labor Policy`      6748.025      1700.364        0.418     3.969    0.000      3339.000    10157.050 
            `Law & Order`     -3233.972      2691.170       -0.155    -1.202    0.235     -8629.443     2161.499 
-----------------------------------------------------------------------------------------------------------------
tbl_regression(pci_capital_mlr, 
               intercept = TRUE) %>% 
  add_glance_source_note(
    label = list(sigma ~ "\U03C3"),
    include = c(r.squared, adj.r.squared, 
                AIC, statistic,
                p.value, sigma))
Characteristic             Beta       95% CI             p-value
(Intercept)                -32,220    -85,007, 20,567    0.2
Entry Costs                -3,739     -8,160, 682        0.10
Land Access                2,394      -3,102, 7,890      0.4
Transparency               252        -3,719, 4,223      0.9
Time Costs                 5,068      927, 9,209         0.017
Informal charges           -3,828     -8,531, 874        0.11
Proactivity                -2,311     -7,106, 2,485      0.3
Business Support Policy    5,247      2,246, 8,249       <0.001
Labor Policy               6,748      3,339, 10,157      <0.001
Law & Order                -3,234     -8,629, 2,161      0.2
R² = 0.546; Adjusted R² = 0.470; AIC = 1,353; Statistic = 7.22; p-value = <0.001; σ = 8,615
CI = Confidence Interval
Note

Model Summary
Adjusted R-Squared of 0.470, indicating that approximately 47.0% of the variation in FDI total registered capital can be accounted for by the independent variables, adjusting for the number of predictors in the model.

ANOVA - Analysis of Variance (F test)
The F-ratio of 7.219 is significant at p < 0.001. Hence, our regression model is statistically significant, suggesting that at least some of the PCI dimensions meaningfully contribute to predicting FDI total registered capital.

The model summary and ANOVA indicate that the model has moderate explanatory power for FDI total registered capital based on the given PCI dimensions, and a better fit than the project-count model. The significance of Business Support Policy, Labor Policy, and Time Costs suggests that these are the influential variables in explaining FDI total registered capital.
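
The σ shown under the tbl_regression table (8,615) differs from the RMSE printed by ols_regress (7912.959). The printed figures are consistent with σ being the square root of the residual mean square (residual sum of squares over 54 degrees of freedom) and RMSE dividing by the 64 observations instead. That reading is inferred from the numbers here rather than taken from the package documentation, so treat the sketch below as a plausibility check.

```r
# Values copied from the ANOVA table for the registered-capital model
ss_residual <- 4007355125.955
df_residual <- 54
n_obs       <- 64  # total degrees of freedom (63) plus one

# Residual standard error: divides by the residual degrees of freedom
sigma_hat <- sqrt(ss_residual / df_residual)
# Root mean square error: divides by the number of observations
rmse      <- sqrt(ss_residual / n_obs)

print(round(sigma_hat))
print(round(rmse, 3))
```

Both values describe typical residual size; they differ only in the divisor.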

Summary of Findings

For the total number of FDI projects, the model has an Adjusted R-Squared of 0.319, indicating that approximately 31.9% of the variation can be explained by the independent variables. The ANOVA results show a significant F-ratio of 4.276 (p < 0.001), suggesting that some PCI dimensions significantly contribute to the model. However, several predictors, such as Entry Costs and Transparency, were not statistically significant.

In contrast, the model for total registered capital has a higher Adjusted R-Squared of 0.470, indicating that 47.0% of the variation is accounted for by the predictors. The ANOVA shows a significant F-ratio of 7.219 (p < 0.001), confirming that the model is statistically significant. Business Support Policy, Labor Policy, and Time Costs are the statistically significant predictors of FDI total registered capital.


3.3 Run Models to Select Independent Variables

I will now run different stepwise regression models to further investigate the specific positive and negative impacts of the independent variables on FDI.

This analysis will allow us to quantify how much a 1-unit increase in each independent variable is expected to influence FDI, providing clearer insights into their contributions to both the total number of projects and total registered capital.

All of the models will use the base model formulated in [3.2 Conduct Linear Regression].
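
To illustrate what a 1-unit increase means in practice, the sketch below plugs the coefficients of the full project-count model from 3.2 into a small prediction function. The province scores (all set to 6) are hypothetical, and predict_projects is an illustrative helper rather than part of any package.

```r
# Coefficients copied from the Parameter Estimates table of pci_project_mlr
coefs <- c(
  "(Intercept)" = 985.260, "Entry Costs" = -578.238, "Land Access" = 219.114,
  "Transparency" = -335.325, "Time Costs" = 459.403,
  "Informal charges" = -453.756, "Proactivity" = -189.001,
  "Business Support Policy" = 681.895, "Labor Policy" = 846.539,
  "Law & Order" = -632.826
)

# Linear prediction: intercept plus the sum of coefficient * score
predict_projects <- function(scores, coefs) {
  unname(coefs["(Intercept)"] + sum(coefs[names(scores)] * scores))
}

# Hypothetical province scoring 6 on every dimension
scores <- setNames(rep(6, 9), names(coefs)[-1])

# Raise Labor Policy by one unit, holding everything else fixed
bumped <- scores
bumped["Labor Policy"] <- bumped["Labor Policy"] + 1

effect <- predict_projects(bumped, coefs) - predict_projects(scores, coefs)
print(effect)  # equals the Labor Policy coefficient
```

The difference depends only on the coefficient of the dimension that changed, which is what the regression tables report as Beta.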

pci_project_fw_mlr <- ols_step_forward_p(
  pci_project_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_project_fw_mlr

                                      Stepwise Summary                                      
------------------------------------------------------------------------------------------
Step    Variable                       AIC         SBC        SBIC        R2       Adj. R2 
------------------------------------------------------------------------------------------
 0      Base Model                   1138.091    1142.409    955.661    0.00000    0.00000 
 1      `Business Support Policy`    1127.319    1133.796    945.026    0.18091    0.16770 
 2      `Labor Policy`               1121.525    1130.161    939.623    0.27482    0.25105 
 3      `Law & Order`                1117.240    1128.035    936.032    0.34265    0.30979 
------------------------------------------------------------------------------------------

Final Model Output 
------------------

                            Model Summary                             
---------------------------------------------------------------------
R                         0.585       RMSE                  1382.121 
R-Squared                 0.343       MSE                2037609.764 
Adj. R-Squared            0.310       Coef. Var              248.063 
Pred R-Squared            0.162       AIC                   1117.240 
MAE                     837.721       SBC                   1128.035 
---------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                  ANOVA                                    
--------------------------------------------------------------------------
                     Sum of                                               
                    Squares        DF     Mean Square      F         Sig. 
--------------------------------------------------------------------------
Regression     63728243.939         3    21242747.980    10.425    0.0000 
Residual      122256585.811        60     2037609.764                     
Total         185984829.750        63                                     
--------------------------------------------------------------------------

                                             Parameter Estimates                                               
--------------------------------------------------------------------------------------------------------------
                    model         Beta    Std. Error    Std. Beta      t        Sig         lower       upper 
--------------------------------------------------------------------------------------------------------------
              (Intercept)    -3908.420      2946.943                 -1.326    0.190    -9803.183    1986.343 
`Business Support Policy`      752.745       234.838        0.343     3.205    0.002      282.998    1222.492 
           `Labor Policy`      883.432       255.821        0.377     3.453    0.001      371.713    1395.151 
            `Law & Order`     -815.756       327.846       -0.270    -2.488    0.016    -1471.546    -159.967 
--------------------------------------------------------------------------------------------------------------
plot(pci_project_fw_mlr)

# fig-width: 12
# fig-height: 10

pci_project_bw_mlr <- ols_step_backward_p(
  pci_project_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_project_bw_mlr

                                  Stepwise Summary                                   
-----------------------------------------------------------------------------------
Step    Variable                AIC         SBC        SBIC        R2       Adj. R2 
-----------------------------------------------------------------------------------
 0      Full Model            1121.656    1145.404    943.667    0.41611    0.31879 
 1      Proactivity           1119.929    1141.518    941.482    0.41362    0.32833 
 2      `Land Access`         1118.159    1137.589    939.287    0.41151    0.33794 
 3      `Informal charges`    1117.405    1134.676    937.879    0.39994    0.33677 
 4      `Time Costs`          1116.714    1131.826    936.614    0.38754    0.33474 
 5      Transparency          1116.758    1129.712    936.060    0.36766    0.32478 
 6      `Entry Costs`         1117.240    1128.035    936.032    0.34265    0.30979 
-----------------------------------------------------------------------------------

Final Model Output 
------------------

                            Model Summary                             
---------------------------------------------------------------------
R                         0.585       RMSE                  1382.121 
R-Squared                 0.343       MSE                2037609.764 
Adj. R-Squared            0.310       Coef. Var              248.063 
Pred R-Squared            0.162       AIC                   1117.240 
MAE                     837.721       SBC                   1128.035 
---------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                  ANOVA                                    
--------------------------------------------------------------------------
                     Sum of                                               
                    Squares        DF     Mean Square      F         Sig. 
--------------------------------------------------------------------------
Regression     63728243.939         3    21242747.980    10.425    0.0000 
Residual      122256585.811        60     2037609.764                     
Total         185984829.750        63                                     
--------------------------------------------------------------------------

                                             Parameter Estimates                                               
--------------------------------------------------------------------------------------------------------------
                    model         Beta    Std. Error    Std. Beta      t        Sig         lower       upper 
--------------------------------------------------------------------------------------------------------------
              (Intercept)    -3908.420      2946.943                 -1.326    0.190    -9803.183    1986.343 
`Business Support Policy`      752.745       234.838        0.343     3.205    0.002      282.998    1222.492 
           `Labor Policy`      883.432       255.821        0.377     3.453    0.001      371.713    1395.151 
            `Law & Order`     -815.756       327.846       -0.270    -2.488    0.016    -1471.546    -159.967 
--------------------------------------------------------------------------------------------------------------
plot(pci_project_bw_mlr)

# fig-width: 12
# fig-height: 10

pci_project_sb_mlr <- ols_step_both_p(
  pci_project_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_project_sb_mlr

                                        Stepwise Summary                                        
----------------------------------------------------------------------------------------------
Step    Variable                           AIC         SBC        SBIC        R2       Adj. R2 
----------------------------------------------------------------------------------------------
 0      Base Model                       1138.091    1142.409    955.661    0.00000    0.00000 
 1      `Business Support Policy` (+)    1127.319    1133.796    945.026    0.18091    0.16770 
 2      `Labor Policy` (+)               1121.525    1130.161    939.623    0.27482    0.25105 
 3      `Law & Order` (+)                1117.240    1128.035    936.032    0.34265    0.30979 
----------------------------------------------------------------------------------------------

Final Model Output 
------------------

                            Model Summary                             
---------------------------------------------------------------------
R                         0.585       RMSE                  1382.121 
R-Squared                 0.343       MSE                2037609.764 
Adj. R-Squared            0.310       Coef. Var              248.063 
Pred R-Squared            0.162       AIC                   1117.240 
MAE                     837.721       SBC                   1128.035 
---------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                  ANOVA                                    
--------------------------------------------------------------------------
                     Sum of                                               
                    Squares        DF     Mean Square      F         Sig. 
--------------------------------------------------------------------------
Regression     63728243.939         3    21242747.980    10.425    0.0000 
Residual      122256585.811        60     2037609.764                     
Total         185984829.750        63                                     
--------------------------------------------------------------------------

                                             Parameter Estimates                                               
--------------------------------------------------------------------------------------------------------------
                    model         Beta    Std. Error    Std. Beta      t        Sig         lower       upper 
--------------------------------------------------------------------------------------------------------------
              (Intercept)    -3908.420      2946.943                 -1.326    0.190    -9803.183    1986.343 
`Business Support Policy`      752.745       234.838        0.343     3.205    0.002      282.998    1222.492 
           `Labor Policy`      883.432       255.821        0.377     3.453    0.001      371.713    1395.151 
            `Law & Order`     -815.756       327.846       -0.270    -2.488    0.016    -1471.546    -159.967 
--------------------------------------------------------------------------------------------------------------
plot(pci_project_sb_mlr)

pci_capital_fw_mlr <- ols_step_forward_p(
  pci_capital_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_capital_fw_mlr

                                      Stepwise Summary                                       
-------------------------------------------------------------------------------------------
Step    Variable                       AIC         SBC         SBIC        R2       Adj. R2 
-------------------------------------------------------------------------------------------
 0      Base Model                   1385.136    1389.454    1202.161    0.00000    0.00000 
 1      `Business Support Policy`    1367.982    1374.459    1185.110    0.25865    0.24669 
 2      `Labor Policy`               1355.344    1363.979    1173.177    0.41022    0.39089 
 3      `Law & Order`                1351.950    1362.744    1170.264    0.45789    0.43078 
-------------------------------------------------------------------------------------------

Final Model Output 
------------------

                             Model Summary                              
-----------------------------------------------------------------------
R                          0.677       RMSE                   8647.671 
R-Squared                  0.458       MSE                79767687.852 
Adj. R-Squared             0.431       Coef. Var               121.341 
Pred R-Squared             0.371       AIC                    1351.950 
MAE                     6737.299       SBC                    1362.744 
-----------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                    ANOVA                                     
-----------------------------------------------------------------------------
                      Sum of                                                 
                     Squares        DF       Mean Square      F         Sig. 
-----------------------------------------------------------------------------
Regression    4042511739.270         3    1347503913.090    16.893    0.0000 
Residual      4786061271.131        60      79767687.852                     
Total         8828573010.401        63                                       
-----------------------------------------------------------------------------

                                               Parameter Estimates                                                
-----------------------------------------------------------------------------------------------------------------
                    model          Beta    Std. Error    Std. Beta      t        Sig          lower        upper 
-----------------------------------------------------------------------------------------------------------------
              (Intercept)    -44762.474     18438.462                 -2.428    0.018    -81644.889    -7880.059 
`Business Support Policy`      6351.523      1469.340        0.420     4.323    0.000      3412.406     9290.640 
           `Labor Policy`      7262.798      1600.626        0.450     4.537    0.000      4061.069    10464.526 
            `Law & Order`     -4711.518      2051.268       -0.227    -2.297    0.025     -8814.666     -608.370 
-----------------------------------------------------------------------------------------------------------------
plot(pci_capital_fw_mlr)

# fig-width: 12
# fig-height: 10

pci_capital_bw_mlr <- ols_step_backward_p(
  pci_capital_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_capital_bw_mlr

                                   Stepwise Summary                                   
------------------------------------------------------------------------------------
Step    Variable                AIC         SBC         SBIC        R2       Adj. R2 
------------------------------------------------------------------------------------
 0      Full Model            1352.585    1376.333    1174.596    0.54609    0.47044 
 1      Transparency          1350.604    1372.193    1172.239    0.54596    0.47991 
 2      `Land Access`         1349.491    1368.921    1170.506    0.53962    0.48208 
 3      Proactivity           1348.523    1365.794    1168.952    0.53214    0.48289 
 4      `Informal charges`    1348.492    1363.604    1168.222    0.51752    0.47593 
 5      `Entry Costs`         1350.489    1363.443    1169.336    0.48642    0.45161 
 6      `Time Costs`          1351.950    1362.744    1170.264    0.45789    0.43078 
------------------------------------------------------------------------------------

Final Model Output 
------------------

                             Model Summary                              
-----------------------------------------------------------------------
R                          0.677       RMSE                   8647.671 
R-Squared                  0.458       MSE                79767687.852 
Adj. R-Squared             0.431       Coef. Var               121.341 
Pred R-Squared             0.371       AIC                    1351.950 
MAE                     6737.299       SBC                    1362.744 
-----------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                    ANOVA                                     
-----------------------------------------------------------------------------
                      Sum of                                                 
                     Squares        DF       Mean Square      F         Sig. 
-----------------------------------------------------------------------------
Regression    4042511739.270         3    1347503913.090    16.893    0.0000 
Residual      4786061271.131        60      79767687.852                     
Total         8828573010.401        63                                       
-----------------------------------------------------------------------------

                                               Parameter Estimates                                                
-----------------------------------------------------------------------------------------------------------------
                    model          Beta    Std. Error    Std. Beta      t        Sig          lower        upper 
-----------------------------------------------------------------------------------------------------------------
              (Intercept)    -44762.474     18438.462                 -2.428    0.018    -81644.889    -7880.059 
`Business Support Policy`      6351.523      1469.340        0.420     4.323    0.000      3412.406     9290.640 
           `Labor Policy`      7262.798      1600.626        0.450     4.537    0.000      4061.069    10464.526 
            `Law & Order`     -4711.518      2051.268       -0.227    -2.297    0.025     -8814.666     -608.370 
-----------------------------------------------------------------------------------------------------------------
plot(pci_capital_bw_mlr)

# fig-width: 12
# fig-height: 10

pci_capital_sb_mlr <- ols_step_both_p(
  pci_capital_mlr, # this is the model
  p_val = 0.05,
  details = FALSE)

pci_capital_sb_mlr

                                        Stepwise Summary                                         
-----------------------------------------------------------------------------------------------
Step    Variable                           AIC         SBC         SBIC        R2       Adj. R2 
-----------------------------------------------------------------------------------------------
 0      Base Model                       1385.136    1389.454    1202.161    0.00000    0.00000 
 1      `Business Support Policy` (+)    1367.982    1374.459    1185.110    0.25865    0.24669 
 2      `Labor Policy` (+)               1355.344    1363.979    1173.177    0.41022    0.39089 
 3      `Law & Order` (+)                1351.950    1362.744    1170.264    0.45789    0.43078 
 4      `Time Costs` (+)                 1350.489    1363.443    1169.336    0.48642    0.45161 
 5      `Entry Costs` (+)                1348.492    1363.604    1168.222    0.51752    0.47593 
-----------------------------------------------------------------------------------------------

Final Model Output 
------------------

                             Model Summary                              
-----------------------------------------------------------------------
R                          0.719       RMSE                   8158.221 
R-Squared                  0.518       MSE                73441733.084 
Adj. R-Squared             0.476       Coef. Var               116.430 
Pred R-Squared             0.414       AIC                    1348.492 
MAE                     6522.401       SBC                    1363.604 
-----------------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 
 AIC: Akaike Information Criteria 
 SBC: Schwarz Bayesian Criteria 

                                   ANOVA                                     
----------------------------------------------------------------------------
                      Sum of                                                
                     Squares        DF      Mean Square      F         Sig. 
----------------------------------------------------------------------------
Regression    4568952491.502         5    913790498.300    12.442    0.0000 
Residual      4259620518.899        58     73441733.084                     
Total         8828573010.401        63                                      
----------------------------------------------------------------------------

                                              Parameter Estimates                                                
----------------------------------------------------------------------------------------------------------------
                    model          Beta    Std. Error    Std. Beta      t        Sig          lower       upper 
----------------------------------------------------------------------------------------------------------------
              (Intercept)    -30495.886     19965.179                 -1.527    0.132    -70460.535    9468.762 
`Business Support Policy`      5572.081      1467.377        0.368     3.797    0.000      2634.807    8509.355 
           `Labor Policy`      6085.465      1630.387        0.377     3.733    0.000      2821.892    9349.039 
            `Law & Order`     -4842.805      2152.763       -0.233    -2.250    0.028     -9152.029    -533.581 
             `Time Costs`      3741.369      1713.689        0.241     2.183    0.033       311.048    7171.689 
            `Entry Costs`     -4197.282      2170.954       -0.191    -1.933    0.058     -8542.918     148.354 
----------------------------------------------------------------------------------------------------------------
plot(pci_capital_sb_mlr)


3.4 Model Selection

Next, we will utilize a radar chart to visualize the performance of the different models.

The model whose polygon reaches the outer boundary on the most axes is considered the best performer, since each axis represents one evaluation metric scaled so that the outer edge is best.

project_metric <- compare_performance(pci_project_mlr,
                              pci_project_fw_mlr$model,
                              pci_project_bw_mlr$model,
                              pci_project_sb_mlr$model)
Some of the nested models seem to be identical
project_metric$Name <- gsub(".*\\\\([a-zA-Z0-9_]+)\\\\, \\\\model\\\\.*", "\\1", project_metric$Name)

# plot radar
plot(project_metric)

capital_metric <- compare_performance(pci_capital_mlr,
                              pci_capital_fw_mlr$model,
                              pci_capital_bw_mlr$model,
                              pci_capital_sb_mlr$model)

capital_metric$Name <- gsub(".*\\\\([a-zA-Z0-9_]+)\\\\, \\\\model\\\\.*", "\\1", capital_metric$Name)

# plot radar
plot(capital_metric)

Note

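For predicting the total number of projects, the best-performing model is pci_project_sb_mlr. The forward, backward and stepwise procedures all retain the same three predictors here, so the three reduced models are identical.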

In contrast, for predicting total registered capital, the best-performing model is pci_capital_sb_mlr.


3.5 Visualize model parameters

We now use the best-performing models to quantify the estimated change, positive or negative, in each dependent variable associated with a one-unit change in each independent variable.

ggcoefstats(pci_project_sb_mlr$model,
            sort = "ascending")

ggcoefstats(pci_capital_sb_mlr$model,
            sort = "ascending")

Note

To attract more Foreign Direct Investment (FDI) projects and registered capital, policymakers should prioritize improvements in Labor Policy and Business Support Policy, the two dimensions with significant positive coefficients in both models.

  • A one-unit increase in Labor Policy is associated with about 883 more FDI projects and about 6,085 more units of registered capital, holding the other predictors constant.

  • A one-unit increase in Business Support Policy is associated with about 753 more FDI projects and about 5,572 more units of registered capital, holding the other predictors constant.

Of the PCI dimensions retained by the models, these two show the strongest positive association with FDI, so they are the most promising areas to focus on.
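As a rough illustration of how the coefficients translate into numbers, the sketch below copies the Labor Policy and Business Support Policy estimates from the Parameter Estimates tables of the two stepwise models and computes the change in the fitted value for a given change in the predictor, holding everything else constant. The object and function names are hypothetical, and the unit of registered capital is whatever unit the source data uses.

```r
# Coefficients copied from the Parameter Estimates tables of
# pci_project_sb_mlr and pci_capital_sb_mlr (names are illustrative).
project_beta <- c(labor_policy = 883.432, business_support = 752.745)
capital_beta <- c(labor_policy = 6085.465, business_support = 5572.081)

# In a linear model, the change in the fitted value is beta * delta
# when one predictor changes by delta and the others stay fixed.
effect_of_change <- function(beta, delta = 1) beta * delta

one_unit <- rbind(
  projects = effect_of_change(project_beta),
  capital  = effect_of_change(capital_beta)
)
print(one_unit)

# The same helper for a half-unit improvement in each dimension.
half_unit <- rbind(
  projects = effect_of_change(project_beta, delta = 0.5),
  capital  = effect_of_change(capital_beta, delta = 0.5)
)
print(half_unit)
```

Because the model is linear, the effect scales proportionally with the size of the change; the sketch says nothing about whether the relationship is causal.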


3.6 Checking for multicollinearity

We now cross-check the correlation matrix by computing the Variance Inflation Factor (VIF) for each predictor.

Interpretation

  • < 5: low multicollinearity

  • 5-10: moderate multicollinearity

  • >10: strong multicollinearity
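To make the thresholds above more concrete, here is a minimal sketch of how a VIF can be computed by hand in base R, using the standard definition VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the remaining predictors. The data are simulated and the helper name `manual_vif` is hypothetical; it is not part of the analysis pipeline.

```r
# Simulated predictors (hypothetical data, not the PCI dataset).
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)   # strongly related to x1
x3 <- rnorm(n)                        # unrelated to the others
predictors <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from
# regressing predictor j on all the other predictors.
manual_vif <- function(data) {
  sapply(names(data), function(v) {
    others <- setdiff(names(data), v)
    aux <- lm(reformulate(others, response = v), data = data)
    1 / (1 - summary(aux)$r.squared)
  })
}

vif_values <- manual_vif(predictors)
print(round(vif_values, 2))
```

In this simulation x1 and x2 should show clearly inflated values while x3 stays close to 1, which is the pattern the interpretation bands are meant to separate.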

check_collinearity(pci_project_sb_mlr$model)
# Check for Multicollinearity

Low Correlation

                    Term  VIF    VIF 95% CI Increased SE Tolerance Tolerance 95% CI
 Business Support Policy 1.04 [1.00, 12.89]         1.02      0.96     [0.08, 1.00]
            Labor Policy 1.09 [1.00,  2.65]         1.04      0.92     [0.38, 1.00]
             Law & Order 1.08 [1.00,  3.13]         1.04      0.93     [0.32, 1.00]
plot(check_collinearity(pci_project_sb_mlr$model)) +
  theme(axis.text.x = element_text(
    angle = 45, 
    hjust = 1
  ))
Variable `Component` is not in your data frame :/

check_collinearity(pci_capital_sb_mlr$model)
# Check for Multicollinearity

Low Correlation

                    Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
 Business Support Policy 1.13 [1.02, 2.03]         1.06      0.89     [0.49, 0.98]
            Labor Policy 1.23 [1.06, 1.89]         1.11      0.81     [0.53, 0.95]
             Law & Order 1.29 [1.09, 1.93]         1.13      0.78     [0.52, 0.92]
              Time Costs 1.46 [1.19, 2.13]         1.21      0.68     [0.47, 0.84]
             Entry Costs 1.17 [1.03, 1.91]         1.08      0.85     [0.52, 0.97]
plot(check_collinearity(pci_capital_sb_mlr$model)) +
  theme(axis.text.x = element_text(
    angle = 45, 
    hjust = 1
  ))
Variable `Component` is not in your data frame :/

Note

No multicollinearity is found in either model: all VIF values are below 1.5 for both the Total Number of Projects model and the Total Registered Capital model.


3.7 Linearity Assumption Test

project_out <- plot(check_model(pci_project_sb_mlr$model,
                        panel = FALSE))
For confidence bands, please install `qqplotr`.
project_out[[2]]

capital_out <- plot(check_model(pci_capital_sb_mlr$model,
                        panel = FALSE))
For confidence bands, please install `qqplotr`.
capital_out[[2]]


3.8 Normality Assumption Test

plot(check_normality(pci_project_sb_mlr$model))
For confidence bands, please install `qqplotr`.

plot(check_normality(pci_capital_sb_mlr$model))
For confidence bands, please install `qqplotr`.


3.9 Checking of outliers

project_outliers <- check_outliers(pci_project_sb_mlr$model,
                           method = "cook")

project_outliers
1 outlier detected: case 30.
- Based on the following method and threshold: cook (0.849).
- For variable: (Whole model).
plot(project_outliers)

capital_outliers <- check_outliers(pci_capital_sb_mlr$model,
                           method = "cook")

capital_outliers
OK: No outliers detected.
- Based on the following method and threshold: cook (0.902).
- For variable: (Whole model)
plot(capital_outliers)

Note

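Based on these diagnostics, I conclude that the models for the Total Number of Projects and Total Registered Capital adequately meet the regression assumptions. One caveat is that Cook's distance flags case 30 as an influential observation in the Total Number of Projects model, whereas no outliers are detected in the Total Registered Capital model.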
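For readers who want to see what a Cook's distance check involves, the following base R sketch flags influential observations in a simulated regression. The data are hypothetical, and the cut-off used here, the median of an F(p, n - p) distribution, is an assumption chosen because it is consistent with the thresholds printed above; the package's exact rule may differ.

```r
# Simulated data with one deliberately influential point (hypothetical).
set.seed(1)
n <- 64
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)
x[n] <- 6      # extreme predictor value (high leverage)
y[n] <- -10    # far below the trend of the other points

fit <- lm(y ~ x)
cooks_d <- cooks.distance(fit)

# Assumed cut-off: median of the F(p, n - p) distribution, where p
# counts the estimated coefficients including the intercept.
p <- length(coef(fit))
threshold <- qf(0.5, df1 = p, df2 = n - p)

# Indices of observations whose Cook's distance exceeds the cut-off.
flagged <- which(cooks_d > threshold)
print(threshold)
print(flagged)
```

A flagged case is a prompt to inspect that observation and, where appropriate, refit the model without it to see how much the coefficients move.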



4.0 Spatial Non-Stationary Assumption

project_mlr_output <- as.data.frame(pci_project_sb_mlr$model$residuals) %>% 
  rename(`SB_MLR_RES` = `pci_project_sb_mlr$model$residuals`)

# join the newly created data frame
project_fdi_sf <- cbind(provincial_boundaries, 
                        project_mlr_output$SB_MLR_RES) %>%
  rename(`MLR_RES` = `project_mlr_output.SB_MLR_RES`)

tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(provincial_boundaries)+
  tmap_options(check.and.fix = TRUE) +
  tm_polygons(alpha = 0.4) +
tm_shape(project_fdi_sf) +  
  tm_polygons(col = "MLR_RES",
          alpha = 0.6,
          size = 0.3,
          style="quantile") 
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape project_fdi_sf is invalid (after reprojection). See
sf::st_is_valid
Variable(s) "MLR_RES" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tmap_mode("plot")
tmap mode set to plotting
# compute k-nearest-neighbour (k = 6) neighbours with st_knn() and row-standardised weights with st_weights() of sfdep.
project_fdi_sf <- project_fdi_sf %>%
  mutate(nb = st_knn(geometry, k=6,
                     longlat = FALSE),
         wt = st_weights(nb,
                         style = "W"),
         .before = 1)
! Polygon provided. Using point on surface.
# global moran_perm 
global_moran_perm(project_fdi_sf$MLR_RES, 
                  project_fdi_sf$nb, 
                  project_fdi_sf$wt, 
                  alternative = "two.sided", 
                  nsim = 999)

    Monte-Carlo simulation of Moran I

data:  x 
weights: listw  
number of simulations + 1: 1000 

statistic = -0.022142, observed rank = 500, p-value = 1
alternative hypothesis: two.sided
capital_mlr_output <- as.data.frame(pci_capital_sb_mlr$model$residuals) %>% 
  rename(`SB_MLR_RES` = `pci_capital_sb_mlr$model$residuals`)

# join the newly created data frame
capital_fdi_sf <- cbind(provincial_boundaries, 
                        capital_mlr_output$SB_MLR_RES) %>%
  rename(`MLR_RES` = `capital_mlr_output.SB_MLR_RES`)

tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(provincial_boundaries)+
  tmap_options(check.and.fix = TRUE) +
  tm_polygons(alpha = 0.4) +
tm_shape(capital_fdi_sf) +  
  tm_polygons(col = "MLR_RES",
          alpha = 0.6,
          size = 0.3,
          style="quantile") 
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape capital_fdi_sf is invalid (after reprojection). See
sf::st_is_valid
Variable(s) "MLR_RES" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tmap_mode("plot")
tmap mode set to plotting
# compute k-nearest-neighbour (k = 6) neighbours with st_knn() and row-standardised weights with st_weights() of sfdep.
capital_fdi_sf <- capital_fdi_sf %>%
  mutate(nb = st_knn(geometry, k=6,
                     longlat = FALSE),
         wt = st_weights(nb,
                         style = "W"),
         .before = 1)
! Polygon provided. Using point on surface.
# global moran_perm 
global_moran_perm(capital_fdi_sf$MLR_RES, 
                  capital_fdi_sf$nb, 
                  capital_fdi_sf$wt, 
                  alternative = "two.sided", 
                  nsim = 999)

    Monte-Carlo simulation of Moran I

data:  x 
weights: listw  
number of simulations + 1: 1000 

statistic = -0.016383, observed rank = 525, p-value = 0.95
alternative hypothesis: two.sided
Note

Based on the results of the Moran’s I permutation tests, I conclude that there is no evidence of significant spatial autocorrelation in the residuals of either model (Moran’s I = -0.022, p-value = 1 for the projects model; Moran’s I = -0.016, p-value = 0.95 for the capital model). This suggests that the residuals do not show systematic clustering or dispersion across the study area.
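The logic of the permutation test can be sketched in base R as follows. This is a simplified illustration on simulated points and residuals, not the routine used above: the helper `moran_i` is hypothetical, and the two-sided pseudo p-value shown is one common definition that may differ from the package's.

```r
# Hypothetical inputs: 30 random point locations and regression residuals.
set.seed(123)
n      <- 30
coords <- cbind(runif(n), runif(n))
res    <- rnorm(n)

# Row-standardised k-nearest-neighbour weights (k = 6, as in the analysis).
k <- 6
d <- as.matrix(dist(coords))
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  nn <- order(d[i, ])[2:(k + 1)]   # skip position 1, the point itself
  W[i, nn] <- 1 / k
}

# Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2,
# where z are the mean-centred values and S0 is the sum of all weights.
moran_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

observed <- moran_i(res, W)

# Reference distribution: shuffle the values over the locations.
nsim <- 999
simulated <- replicate(nsim, moran_i(sample(res), W))

# One common two-sided pseudo p-value (an assumption of this sketch).
p_value <- (sum(abs(simulated) >= abs(observed)) + 1) / (nsim + 1)
print(c(statistic = observed, p_value = p_value))
```

A large p-value means the observed statistic is unremarkable compared with what random reshuffling produces, which is the situation reported for both sets of residuals.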



5.0 Local

Preparing the data

pci_2021 <- pci_2021 %>% 
  left_join(provincial_boundaries %>% 
              select(province_code, geometry), 
            by = "province_code") %>% 
  st_as_sf()
Warning in left_join(., provincial_boundaries %>% select(province_code, : Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 11 of `x` matches multiple rows in `y`.
ℹ Row 2 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =
  "many-to-many"` to silence this warning.

Fixed VS Adaptive Bandwidth

Note

In the comparison, the best Fixed-bandwidth configuration outperforms the best Adaptive-bandwidth configuration. Among the approaches and kernels tested, the ‘CV’ approach combined with the ‘boxcar’ kernel consistently yields the highest Adjusted R². The analysis therefore uses a Fixed bandwidth with the ‘CV’ approach and the ‘boxcar’ kernel.

Note

After evaluating the various approaches and kernel types, we found that there were no differences in performance across the different approaches.

However, when it came to kernel selection, the ‘boxcar’ kernel consistently outperformed the others, achieving the highest Adjusted R². This suggests that, for this particular dataset and modeling context, the ‘boxcar’ kernel offers the most effective fit.
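To show what a fixed bandwidth with a boxcar kernel means in practice, here is a minimal base R sketch of locally weighted regression: every observation within the bandwidth of a location gets weight 1, everything else gets weight 0, and one regression is fitted per location. The data, the bandwidth values and the helper names are hypothetical; this is a simplified illustration and not the GWmodel implementation.

```r
# Hypothetical data: 50 locations with one predictor whose effect
# varies from west to east.
set.seed(7)
n      <- 50
coords <- cbind(runif(n, 0, 100), runif(n, 0, 100))
x      <- rnorm(n)
y      <- 1 + (0.5 + coords[, 1] / 100) * x + rnorm(n, sd = 0.2)

# Boxcar kernel: weight 1 within the bandwidth, 0 beyond it.
boxcar <- function(dist, bw) as.numeric(dist <= bw)

# Fit one weighted regression per location and keep the local slope.
local_slopes <- function(bw) {
  d <- as.matrix(dist(coords))
  sapply(seq_len(n), function(i) {
    w <- boxcar(d[i, ], bw)
    coef(lm(y ~ x, weights = w))[["x"]]
  })
}

slopes_fixed  <- local_slopes(bw = 40)    # fixed bandwidth of 40 units
slopes_global <- local_slopes(bw = 1e6)   # bandwidth covering all points
print(summary(slopes_fixed))
```

With a bandwidth large enough to cover every observation, each local fit uses the full dataset and the local slopes collapse to the global OLS slope, which is why a very large selected bandwidth tends to produce local estimates close to the global regression.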

bw.fixed_project <- bw.gwr(formula = total_project_count ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="CV", 
                           kernel="boxcar", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 CV score: 222394245 
Fixed bandwidth: 598861.3 CV score: 253731784 
Fixed bandwidth: 1197409 CV score: 166709145 
Fixed bandwidth: 1338707 CV score: 171205597 
Fixed bandwidth: 1110082 CV score: 208175729 
Fixed bandwidth: 1251380 CV score: 171769613 
Fixed bandwidth: 1164053 CV score: 161667008 
Fixed bandwidth: 1143438 CV score: 162515312 
Fixed bandwidth: 1176794 CV score: 165824047 
Fixed bandwidth: 1156179 CV score: 161322847 
Fixed bandwidth: 1151312 CV score: 161298848 
Fixed bandwidth: 1148305 CV score: 160623353 
Fixed bandwidth: 1146446 CV score: 163009911 
Fixed bandwidth: 1149453 CV score: 160891544 
Fixed bandwidth: 1147595 CV score: 160623353 
gwr.fixed_project <- gwr.basic(formula = total_project_count ~ 
                                 `Entry Costs` +  `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_project, 
                               kernel = 'boxcar', 
                               longlat = FALSE)

gwr.fixed_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:52.878684 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.fixed_project, kernel = "boxcar", 
    longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: boxcar 
   Fixed bandwidth: 1147595 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                 Min.  1st Qu.   Median  3rd Qu.      Max.
   Intercept                 -2029.95  -637.88   596.48  1123.48 1882.8340
   `Entry Costs`              -749.00  -545.78  -493.11  -414.02 -281.8974
   `Land Access`              -201.59   218.75   264.32   435.01  548.4379
   Transparency               -658.25  -383.53  -360.58  -317.80    8.2392
   `Time Costs`                333.00   407.81   412.38   445.83  514.3275
   `Informal charges`         -671.17  -532.00  -452.95  -323.95    1.2336
   Proactivity                -289.04  -175.30  -142.30   -21.09   89.7491
   `Business Support Policy`   310.81   608.78   687.04   719.28  827.6244
   `Labor Policy`              589.33   765.44   834.32   923.34 1060.9729
   `Law & Order`             -1003.16  -724.69  -674.45  -636.49 -363.5316
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 13.16825 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 52.83175 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1163.596 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1139.972 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1115.975 
   Residual sum of squares: 100389487 
   R-square value:  0.4602377 
   Adjusted R-square value:  0.3231069 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:52.904205 
# Search for the fixed bandwidth that minimises AICc (boxcar kernel)
bw.fixed_project <- bw.gwr(formula = total_project_count ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="aic", 
                           kernel="boxcar", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 AICc value: 1183.12 
Fixed bandwidth: 598861.3 AICc value: 1199.317 
Fixed bandwidth: 1197409 AICc value: 1164.742 
Fixed bandwidth: 1338707 AICc value: 1164.967 
Fixed bandwidth: 1110082 AICc value: 1176.805 
Fixed bandwidth: 1251380 AICc value: 1165.75 
Fixed bandwidth: 1164053 AICc value: 1163.872 
Fixed bandwidth: 1143438 AICc value: 1164.153 
Fixed bandwidth: 1176794 AICc value: 1165.42 
Fixed bandwidth: 1156179 AICc value: 1163.832 
Fixed bandwidth: 1151312 AICc value: 1163.795 
Fixed bandwidth: 1148305 AICc value: 1163.596 
Fixed bandwidth: 1146446 AICc value: 1164.296 
Fixed bandwidth: 1149453 AICc value: 1163.669 
Fixed bandwidth: 1147595 AICc value: 1163.596 
# Fit the fixed-bandwidth GWR model with the selected bandwidth (boxcar kernel)
gwr.fixed_project <- gwr.basic(formula = total_project_count ~ 
                                 `Entry Costs` +  `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_project, 
                               kernel = 'boxcar', 
                               longlat = FALSE)

gwr.fixed_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:52.963376 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.fixed_project, kernel = "boxcar", 
    longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: boxcar 
   Fixed bandwidth: 1147595 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                 Min.  1st Qu.   Median  3rd Qu.      Max.
   Intercept                 -2029.95  -637.88   596.48  1123.48 1882.8340
   `Entry Costs`              -749.00  -545.78  -493.11  -414.02 -281.8974
   `Land Access`              -201.59   218.75   264.32   435.01  548.4379
   Transparency               -658.25  -383.53  -360.58  -317.80    8.2392
   `Time Costs`                333.00   407.81   412.38   445.83  514.3275
   `Informal charges`         -671.17  -532.00  -452.95  -323.95    1.2336
   Proactivity                -289.04  -175.30  -142.30   -21.09   89.7491
   `Business Support Policy`   310.81   608.78   687.04   719.28  827.6244
   `Labor Policy`              589.33   765.44   834.32   923.34 1060.9729
   `Law & Order`             -1003.16  -724.69  -674.45  -636.49 -363.5316
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 13.16825 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 52.83175 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1163.596 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1139.972 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1115.975 
   Residual sum of squares: 100389487 
   R-square value:  0.4602377 
   Adjusted R-square value:  0.3231069 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:52.987234 
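The adjusted R-square values in the two diagnostic blocks above can be reproduced from the R-square value, the number of data points, and the number of parameters. The sketch below assumes the textbook adjustment 1 - (1 - R2)(n - 1)/(n - k - 1), with k taken as the number of predictors for the global model and as the effective number of parameters for GWR. That choice of k is an assumption that matches the printed figures; it is not a statement of how GWmodel computes them internally. `adj_r2` is a helper name introduced here for illustration.

```r
# Hypothetical helper: textbook adjusted R-squared.
# k = number of predictors (global model) or effective number of
# parameters (GWR); the latter is an assumption, see the text above.
adj_r2 <- function(r2, n, k) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

n <- 66  # number of data points reported in the output

# Global regression: 9 predictors, Multiple R-squared 0.3885
global_adj <- adj_r2(0.3885, n, 9)

# GWR, boxcar kernel: R-square 0.4602377, effective parameters 13.16825
boxcar_adj <- adj_r2(0.4602377, n, 13.16825)

round(c(global = global_adj, boxcar = boxcar_adj), 4)
```

Both values agree with the printed output to four decimal places (0.2902 for the global model and 0.3231 for the boxcar GWR), so the higher adjusted R-square of the GWR fit already accounts for its larger effective number of parameters.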
# Load the adjusted R2 recorded for each approach
approach_comparison <- read_xlsx("data/rds/project_local_fixed_approach_comparison.xlsx")

# Bar chart of adjusted R2 by approach; geom_col() plots the values as given
ggplot(approach_comparison, aes(x = Approach, y = `Adjusted R2`)) +
  geom_col(fill = "steelblue", color = "black") +
  labs(title = "Comparison of Approaches",
       x = "Approach",
       y = "Adjusted R2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # rotate x-axis labels for readability

# Search for the fixed bandwidth that minimises AICc (Gaussian kernel)
bw.fixed_project <- bw.gwr(formula = total_project_count ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="aic", 
                           kernel="gaussian", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 AICc value: 1168.817 
Fixed bandwidth: 598861.3 AICc value: 1180.6 
Fixed bandwidth: 1197409 AICc value: 1166.348 
Fixed bandwidth: 1338707 AICc value: 1165.433 
Fixed bandwidth: 1426034 AICc value: 1165.003 
Fixed bandwidth: 1480005 AICc value: 1164.776 
Fixed bandwidth: 1513361 AICc value: 1164.648 
Fixed bandwidth: 1533976 AICc value: 1164.573 
Fixed bandwidth: 1546717 AICc value: 1164.528 
Fixed bandwidth: 1554591 AICc value: 1164.501 
Fixed bandwidth: 1559458 AICc value: 1164.485 
Fixed bandwidth: 1562465 AICc value: 1164.475 
Fixed bandwidth: 1564324 AICc value: 1164.469 
Fixed bandwidth: 1565473 AICc value: 1164.465 
Fixed bandwidth: 1566183 AICc value: 1164.462 
Fixed bandwidth: 1566622 AICc value: 1164.461 
Fixed bandwidth: 1566893 AICc value: 1164.46 
Fixed bandwidth: 1567061 AICc value: 1164.459 
Fixed bandwidth: 1567164 AICc value: 1164.459 
Fixed bandwidth: 1567228 AICc value: 1164.459 
Fixed bandwidth: 1567268 AICc value: 1164.459 
Fixed bandwidth: 1567292 AICc value: 1164.459 
# Fit the fixed-bandwidth GWR model with the selected bandwidth (Gaussian kernel)
gwr.fixed_project <- gwr.basic(formula = total_project_count ~ 
                                 `Entry Costs` +  `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_project, 
                               kernel = 'gaussian', 
                               longlat = FALSE)

gwr.fixed_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:53.221664 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.fixed_project, kernel = "gaussian", 
    longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: gaussian 
   Fixed bandwidth: 1567292 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                Min. 1st Qu.  Median 3rd Qu.    Max.
   Intercept                 1166.14 1177.05 1208.42 1244.11 1261.92
   `Entry Costs`             -505.12 -503.55 -502.28 -499.52 -496.29
   `Land Access`              199.33  214.52  259.21  289.29  302.15
   Transparency              -379.26 -373.87 -364.50 -347.82 -341.32
   `Time Costs`               408.43  410.36  414.89  418.05  419.51
   `Informal charges`        -496.98 -484.73 -458.67 -417.55 -402.98
   Proactivity               -159.37 -151.46 -137.76 -125.56 -120.11
   `Business Support Policy`  595.54  601.31  617.61  632.44  638.38
   `Labor Policy`             788.12  795.48  817.50  832.99  839.76
   `Law & Order`             -689.58 -683.71 -673.35 -660.87 -657.64
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 11.93413 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 54.06587 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1164.459 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1145.516 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1114.689 
   Residual sum of squares: 112786813 
   R-square value:  0.3935812 
   Adjusted R-square value:  0.257202 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:53.245548 
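With the Gaussian kernel the local coefficients barely move (for example, `Entry Costs` ranges only from -505.12 to -496.29). One way to see why is to look at the weights the kernel assigns at the selected bandwidth. The sketch below writes out the kernel weight functions as described in the GWmodel documentation; `kernel_weight` is a helper name introduced here, and the sample distances are illustrative assumptions rather than distances taken from `pci_2021`.

```r
# Kernel weight functions as described in the GWmodel documentation.
# d is the distance and bw the bandwidth, both in the same units.
kernel_weight <- function(d, bw, kernel) {
  r <- d / bw
  switch(kernel,
    gaussian    = exp(-0.5 * r^2),
    exponential = exp(-r),
    bisquare    = ifelse(d < bw, (1 - r^2)^2, 0),
    tricube     = ifelse(d < bw, (1 - r^3)^3, 0),
    boxcar      = ifelse(d < bw, 1, 0),
    stop("unknown kernel: ", kernel)
  )
}

bw_gaussian <- 1567292                # fixed bandwidth reported above
d <- c(0, 250000, 500000, 1000000)    # illustrative distances (assumed)

w_gauss <- kernel_weight(d, bw_gaussian, "gaussian")
round(w_gauss, 3)
```

Even at a distance of 1,000,000 units the Gaussian weight stays above 0.8 under these assumptions, so every local regression uses nearly the whole sample with almost equal weights. That is consistent with local estimates that sit close to the global coefficients.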
# Search for the fixed bandwidth that minimises AICc (exponential kernel)
bw.fixed_project <- bw.gwr(formula = total_project_count ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="aic", 
                           kernel="exponential", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 AICc value: 1174.197 
Fixed bandwidth: 598861.3 AICc value: 1184.164 
Fixed bandwidth: 1197409 AICc value: 1171.483 
Fixed bandwidth: 1338707 AICc value: 1170.326 
Fixed bandwidth: 1426034 AICc value: 1169.74 
Fixed bandwidth: 1480005 AICc value: 1169.416 
Fixed bandwidth: 1513361 AICc value: 1169.229 
Fixed bandwidth: 1533976 AICc value: 1169.118 
Fixed bandwidth: 1546717 AICc value: 1169.051 
Fixed bandwidth: 1554591 AICc value: 1169.01 
Fixed bandwidth: 1559458 AICc value: 1168.985 
Fixed bandwidth: 1562465 AICc value: 1168.97 
Fixed bandwidth: 1564324 AICc value: 1168.96 
Fixed bandwidth: 1565473 AICc value: 1168.954 
Fixed bandwidth: 1566183 AICc value: 1168.951 
Fixed bandwidth: 1566622 AICc value: 1168.949 
Fixed bandwidth: 1566893 AICc value: 1168.947 
Fixed bandwidth: 1567061 AICc value: 1168.946 
Fixed bandwidth: 1567164 AICc value: 1168.946 
Fixed bandwidth: 1567228 AICc value: 1168.945 
Fixed bandwidth: 1567268 AICc value: 1168.945 
Fixed bandwidth: 1567292 AICc value: 1168.945 
Fixed bandwidth: 1567308 AICc value: 1168.945 
# Fit the fixed-bandwidth GWR model with the selected bandwidth (exponential kernel)
gwr.fixed_project <- gwr.basic(formula = total_project_count ~ 
                                 `Entry Costs` +  `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_project, 
                               kernel = 'exponential', 
                               longlat = FALSE)

gwr.fixed_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:53.309016 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.fixed_project, kernel = "exponential", 
    longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: exponential 
   Fixed bandwidth: 1567308 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                 Min.  1st Qu.   Median  3rd Qu.     Max.
   Intercept                  921.098 1102.324 1237.589 1307.663 1464.419
   `Entry Costs`             -512.198 -501.489 -497.444 -492.974 -473.137
   `Land Access`              134.633  166.507  258.237  328.913  347.230
   Transparency              -413.251 -392.057 -359.485 -349.276 -322.875
   `Time Costs`               394.279  398.375  420.596  430.652  439.401
   `Informal charges`        -552.286 -533.504 -466.434 -349.795 -340.288
   Proactivity               -205.988 -185.073 -129.074  -94.547  -82.668
   `Business Support Policy`  571.054  590.459  617.505  670.660  692.679
   `Labor Policy`             758.142  782.903  815.022  864.178  888.965
   `Law & Order`             -733.490 -707.372 -689.447 -669.442 -650.840
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 16.41498 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 49.58502 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1168.945 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1144.407 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1121.652 
   Residual sum of squares: 106735609 
   R-square value:  0.4261166 
   Adjusted R-square value:  0.2322238 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:53.332372 
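The bandwidth search and model fit are repeated with identical arguments for each kernel, so the same steps could be wrapped in a loop. The following is only a sketch: `fit_fixed_kernels` is a hypothetical helper, it assumes the GWmodel package is installed, and it assumes the object returned by `gwr.basic()` exposes a `GW.diagnostic` list with `AICc` and `gwR2.adj` entries (worth confirming with `names(fit$GW.diagnostic)` before relying on it).

```r
# Sketch only: assumes GWmodel is installed and that `data` is the spatial
# object passed to bw.gwr() and gwr.basic() in the text (pci_2021).
fit_fixed_kernels <- function(formula, data,
                              kernels = c("boxcar", "gaussian", "exponential",
                                          "bisquare", "tricube")) {
  rows <- lapply(kernels, function(k) {
    # Bandwidth search by AICc, same arguments as the individual calls
    bw <- GWmodel::bw.gwr(formula = formula, data = data, approach = "aic",
                          kernel = k, adaptive = FALSE, longlat = FALSE)
    # Fixed-bandwidth GWR fit with the selected bandwidth
    fit <- GWmodel::gwr.basic(formula = formula, data = data, bw = bw,
                              kernel = k, adaptive = FALSE, longlat = FALSE)
    diag <- fit$GW.diagnostic  # assumed field names, see the text above
    data.frame(kernel = k, bandwidth = bw,
               AICc = diag$AICc, adj_r2 = diag$gwR2.adj)
  })
  do.call(rbind, rows)  # one row per kernel
}

# Usage (not run here):
# fit_fixed_kernels(total_project_count ~ `Entry Costs` + `Land Access` +
#                     Transparency + `Time Costs` + `Informal charges` +
#                     Proactivity + `Business Support Policy` +
#                     `Labor Policy` + `Law & Order`,
#                   data = pci_2021)
```

Returning one row per kernel keeps each fit from overwriting the previous one, which is what happens when `bw.fixed_project` and `gwr.fixed_project` are reassigned in every run.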
# Search for the fixed bandwidth that minimises AICc (bisquare kernel)
bw.fixed_project <- bw.gwr(formula = total_project_count ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="aic", 
                           kernel="bisquare", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 AICc value: 1199.911 
Fixed bandwidth: 598861.3 AICc value: 1235.239 
Fixed bandwidth: 1197409 AICc value: 1190.84 
Fixed bandwidth: 1338707 AICc value: 1184.972 
Fixed bandwidth: 1426034 AICc value: 1181.649 
Fixed bandwidth: 1480005 AICc value: 1179.806 
Fixed bandwidth: 1513361 AICc value: 1178.756 
Fixed bandwidth: 1533976 AICc value: 1178.143 
Fixed bandwidth: 1546717 AICc value: 1177.778 
Fixed bandwidth: 1554591 AICc value: 1177.558 
Fixed bandwidth: 1559458 AICc value: 1177.423 
Fixed bandwidth: 1562465 AICc value: 1177.341 
Fixed bandwidth: 1564324 AICc value: 1177.291 
Fixed bandwidth: 1565473 AICc value: 1177.26 
Fixed bandwidth: 1566183 AICc value: 1177.24 
Fixed bandwidth: 1566622 AICc value: 1177.229 
Fixed bandwidth: 1566893 AICc value: 1177.221 
Fixed bandwidth: 1567061 AICc value: 1177.217 
Fixed bandwidth: 1567164 AICc value: 1177.214 
Fixed bandwidth: 1567228 AICc value: 1177.212 
Fixed bandwidth: 1567268 AICc value: 1177.211 
Fixed bandwidth: 1567292 AICc value: 1177.211 
Fixed bandwidth: 1567308 AICc value: 1177.21 
Fixed bandwidth: 1567317 AICc value: 1177.21 
Fixed bandwidth: 1567323 AICc value: 1177.21 
Fixed bandwidth: 1567326 AICc value: 1177.21 
# Fit the fixed-bandwidth GWR model with the selected bandwidth (bisquare kernel)
gwr.fixed_project <- gwr.basic(formula = total_project_count ~ 
                                 `Entry Costs` +  `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_project, 
                               kernel = 'bisquare', 
                               longlat = FALSE)

gwr.fixed_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:53.397755 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.fixed_project, kernel = "bisquare", 
    longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: bisquare 
   Fixed bandwidth: 1567326 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                 Min.  1st Qu.   Median  3rd Qu.     Max.
   Intercept                  171.528  286.920  825.373 1145.121 1458.521
   `Entry Costs`             -521.675 -484.885 -458.434 -427.223 -358.813
   `Land Access`              -90.262   38.677  292.007  415.692  436.891
   Transparency              -443.170 -428.465 -379.856 -306.994 -298.914
   `Time Costs`               354.698  390.184  427.766  435.608  440.747
   `Informal charges`        -638.719 -569.120 -468.949 -243.768  -71.021
   Proactivity               -218.743 -172.053 -120.470  -39.716   27.323
   `Business Support Policy`  486.249  537.993  636.984  724.655  767.989
   `Labor Policy`             646.760  720.216  845.045  920.186  943.933
   `Law & Order`             -806.220 -777.004 -703.633 -656.433 -644.898
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 17.98289 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 48.01711 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1177.21 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1148.049 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1131.191 
   Residual sum of squares: 109674947 
   R-square value:  0.4103127 
   Adjusted R-square value:  0.1847718 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:53.422031 
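The diagnostics printed so far can be gathered into one table to compare the kernels side by side. The values below are copied from the outputs above; only the four kernels whose diagnostic blocks appear above are included. The object and column names (`kernel_summary`, `delta_aicc`, `best_kernel`) are introduced here for illustration.

```r
# Diagnostics copied from the fixed-bandwidth outputs above
kernel_summary <- data.frame(
  kernel    = c("boxcar", "gaussian", "exponential", "bisquare"),
  bandwidth = c(1147595, 1567292, 1567308, 1567326),
  enp       = c(13.16825, 11.93413, 16.41498, 17.98289),
  AICc      = c(1163.596, 1164.459, 1168.945, 1177.21),
  adj_r2    = c(0.3231069, 0.257202, 0.2322238, 0.1847718)
)

global_aicc <- 1161.927  # AICc of the global regression

# Positive values mean the GWR fit has a higher AICc than the global model
kernel_summary$delta_aicc <- kernel_summary$AICc - global_aicc

# Order from lowest to highest AICc
kernel_summary <- kernel_summary[order(kernel_summary$AICc), ]
kernel_summary

best_kernel <- kernel_summary$kernel[1]
```

On these figures the boxcar kernel has both the lowest AICc and the highest adjusted R-square among the fixed-bandwidth fits, and it is the only one whose adjusted R-square exceeds the global model's 0.2902. Every GWR fit nonetheless has a higher AICc than the global regression, so the evidence for spatially varying coefficients is modest.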
# Search for the fixed bandwidth that minimises AICc (tricube kernel)
bw.fixed_project <- bw.gwr(formula = total_project_count ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="aic", 
                           kernel="tricube", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 AICc value: 1198.024 
Fixed bandwidth: 598861.3 AICc value: 1232.151 
Fixed bandwidth: 1197409 AICc value: 1189.473 
Fixed bandwidth: 1338707 AICc value: 1184.055 
Fixed bandwidth: 1426034 AICc value: 1180.63 
Fixed bandwidth: 1480005 AICc value: 1178.676 
Fixed bandwidth: 1513361 AICc value: 1177.551 
Fixed bandwidth: 1533976 AICc value: 1176.89 
Fixed bandwidth: 1546717 AICc value: 1176.495 
Fixed bandwidth: 1554591 AICc value: 1176.256 
Fixed bandwidth: 1559458 AICc value: 1176.111 
Fixed bandwidth: 1562465 AICc value: 1176.022 
Fixed bandwidth: 1564324 AICc value: 1175.967 
Fixed bandwidth: 1565473 AICc value: 1175.933 
Fixed bandwidth: 1566183 AICc value: 1175.912 
Fixed bandwidth: 1566622 AICc value: 1175.899 
Fixed bandwidth: 1566893 AICc value: 1175.891 
Fixed bandwidth: 1567061 AICc value: 1175.886 
Fixed bandwidth: 1567164 AICc value: 1175.883 
Fixed bandwidth: 1567228 AICc value: 1175.881 
Fixed bandwidth: 1567268 AICc value: 1175.88 
Fixed bandwidth: 1567292 AICc value: 1175.88 
Fixed bandwidth: 1567308 AICc value: 1175.879 
Fixed bandwidth: 1567317 AICc value: 1175.879 
Fixed bandwidth: 1567323 AICc value: 1175.879 
Fixed bandwidth: 1567326 AICc value: 1175.879 
Fixed bandwidth: 1567328 AICc value: 1175.879 
# Fit the fixed-bandwidth GWR model with the selected bandwidth (tricube kernel)
gwr.fixed_project <- gwr.basic(formula = total_project_count ~ 
                                 `Entry Costs` +  `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_project, 
                               kernel = 'tricube', 
                               longlat = FALSE)

gwr.fixed_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:53.492949 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.fixed_project, kernel = "tricube", 
    longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: tricube 
   Fixed bandwidth: 1567328 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                 Min.  1st Qu.   Median  3rd Qu.     Max.
   Intercept                 -143.453   57.337  940.942 1011.952 1429.612
   `Entry Costs`             -525.115 -490.531 -458.812 -422.186 -351.397
   `Land Access`             -101.707   38.604  294.537  422.736  441.667
   Transparency              -428.220 -421.746 -387.704 -290.237 -282.847
   `Time Costs`               347.498  399.838  422.606  430.794  435.815
   `Informal charges`        -644.405 -549.931 -462.693 -263.635  -57.088
   Proactivity               -199.521 -161.045 -123.284  -47.232   37.181
   `Business Support Policy`  464.228  531.809  641.338  714.835  760.311
   `Labor Policy`             634.721  719.736  849.435  906.144  936.084
   `Law & Order`             -792.183 -774.151 -708.416 -629.165 -608.759
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 16.80409 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 49.19591 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1175.879 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1148.275 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1129.498 
   Residual sum of squares: 111058937 
   R-square value:  0.4028715 
   Adjusted R-square value:  0.1946754 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:53.517633 
bw.fixed_project <- bw.gwr(formula = total_project_count ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="aic", 
                           kernel="boxcar", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 AICc value: 1183.12 
Fixed bandwidth: 598861.3 AICc value: 1199.317 
Fixed bandwidth: 1197409 AICc value: 1164.742 
Fixed bandwidth: 1338707 AICc value: 1164.967 
Fixed bandwidth: 1110082 AICc value: 1176.805 
Fixed bandwidth: 1251380 AICc value: 1165.75 
Fixed bandwidth: 1164053 AICc value: 1163.872 
Fixed bandwidth: 1143438 AICc value: 1164.153 
Fixed bandwidth: 1176794 AICc value: 1165.42 
Fixed bandwidth: 1156179 AICc value: 1163.832 
Fixed bandwidth: 1151312 AICc value: 1163.795 
Fixed bandwidth: 1148305 AICc value: 1163.596 
Fixed bandwidth: 1146446 AICc value: 1164.296 
Fixed bandwidth: 1149453 AICc value: 1163.669 
Fixed bandwidth: 1147595 AICc value: 1163.596 
gwr.fixed_project <- gwr.basic(formula = total_project_count ~ 
                                 `Entry Costs` +  `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_project, 
                               kernel = 'boxcar', 
                               longlat = FALSE)

gwr.fixed_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:53.573714 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.fixed_project, kernel = "boxcar", 
    longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: boxcar 
   Fixed bandwidth: 1147595 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                 Min.  1st Qu.   Median  3rd Qu.      Max.
   Intercept                 -2029.95  -637.88   596.48  1123.48 1882.8340
   `Entry Costs`              -749.00  -545.78  -493.11  -414.02 -281.8974
   `Land Access`              -201.59   218.75   264.32   435.01  548.4379
   Transparency               -658.25  -383.53  -360.58  -317.80    8.2392
   `Time Costs`                333.00   407.81   412.38   445.83  514.3275
   `Informal charges`         -671.17  -532.00  -452.95  -323.95    1.2336
   Proactivity                -289.04  -175.30  -142.30   -21.09   89.7491
   `Business Support Policy`   310.81   608.78   687.04   719.28  827.6244
   `Labor Policy`              589.33   765.44   834.32   923.34 1060.9729
   `Law & Order`             -1003.16  -724.69  -674.45  -636.49 -363.5316
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 13.16825 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 52.83175 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1163.596 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1139.972 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1115.975 
   Residual sum of squares: 100389487 
   R-square value:  0.4602377 
   Adjusted R-square value:  0.3231069 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:53.59693 
kernel_comparison <- read_xlsx("data/rds/project_local_fixed_kernel_comparison.xlsx")

ggplot(kernel_comparison, aes(x = reorder(Kernel, `Adjusted R2`), y = `Adjusted R2`)) +
  geom_bar(stat = "identity", fill = "steelblue", color = "black") +
  labs(title = "Comparison of Kernel Methods",
       x = "Kernel",
       y = "Adjusted R2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotate x-axis labels for readability

Note

After evaluating the bandwidth-selection approaches and kernel types for the fixed-bandwidth model, we found no difference in performance between the approaches.

Kernel selection, however, did matter. The ‘boxcar’ kernel achieved the highest Adjusted R² (0.323 at a fixed bandwidth of 1147595, compared with 0.195 for the ‘tricube’ kernel and 0.290 for the global regression). This suggests that, for this dataset and model specification, the ‘boxcar’ kernel offers the best fit among the kernels tested.
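To make the kernel comparison more concrete, the sketch below shows how each kernel turns distance into a regression weight, using the weighting formulas given in the GWmodel documentation. The helper name `gw_weight_sketch` and the sample distances are illustrative assumptions, not part of the original analysis; the bandwidth is the fixed ‘boxcar’ bandwidth reported above.

```r
# Illustrative helper (hypothetical name): weight given to an observation at
# distance d from the regression point, for bandwidth bw.
gw_weight_sketch <- function(d, bw,
                             kernel = c("gaussian", "exponential",
                                        "bisquare", "tricube", "boxcar")) {
  kernel <- match.arg(kernel)
  inside <- d < bw  # the compact kernels give zero weight beyond the bandwidth
  switch(kernel,
    gaussian    = exp(-0.5 * (d / bw)^2),
    exponential = exp(-d / bw),
    bisquare    = ifelse(inside, (1 - (d / bw)^2)^2, 0),
    tricube     = ifelse(inside, (1 - (d / bw)^3)^3, 0),
    boxcar      = ifelse(inside, 1, 0)
  )
}

bw <- 1147595                      # fixed bandwidth selected for the boxcar kernel
frac <- c(0, 0.25, 0.5, 0.75, 1.25)  # assumed sample distances, as fractions of bw
d <- frac * bw

kernels <- c("gaussian", "exponential", "bisquare", "tricube", "boxcar")
weights <- sapply(kernels, function(k) gw_weight_sketch(d, bw, k))
rownames(weights) <- paste0(frac, " x bw")

# One row per distance, one column per kernel
print(round(weights, 3))
```

The boxcar column contains only ones and zeros: every province inside the bandwidth counts equally and everything outside is ignored, whereas the other kernels down-weight provinces gradually with distance.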

bw.adaptive_project <- bw.gwr(formula = total_project_count ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="boxcar", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 204424128 
Adaptive bandwidth: 38 CV score: 235355686 
Adaptive bandwidth: 56 CV score: 181731389 
Adaptive bandwidth: 59 CV score: 174435740 
Adaptive bandwidth: 63 CV score: 168499922 
Adaptive bandwidth: 63 CV score: 168499922 
gwr.adaptive_project <- gwr.basic(formula = total_project_count ~ 
                                    `Entry Costs` + `Land Access` + Transparency +
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_project,
                                  kernel = 'boxcar',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:53.850187 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.adaptive_project, kernel = "boxcar", 
    adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: boxcar 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                Min. 1st Qu.  Median 3rd Qu.     Max.
   Intercept                  214.29  214.29  971.88  971.88 1420.200
   `Entry Costs`             -575.15 -575.15 -561.30 -561.30 -463.273
   `Land Access`              221.39  221.39  289.30  353.31  353.307
   Transparency              -433.66 -353.89 -353.89 -333.83 -333.826
   `Time Costs`               359.70  414.10  414.10  468.73  468.725
   `Informal charges`        -487.83 -456.36 -456.36 -356.18 -356.180
   Proactivity               -187.82 -187.82 -140.26 -139.39  -48.202
   `Business Support Policy`  603.83  634.57  634.57  677.86  677.865
   `Labor Policy`             809.11  842.05  864.88  864.88  937.711
   `Law & Order`             -826.69 -737.49 -729.78 -637.65 -637.654
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 10.32786 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 55.67214 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1163.372 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1145.841 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1112.783 
   Residual sum of squares: 114549287 
   R-square value:  0.3841049 
   Adjusted R-square value:  0.267759 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:53.874991 
bw.adaptive_project <- bw.gwr(formula = total_project_count ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="aic", 
                              kernel="boxcar", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth (number of nearest neighbours): 48 AICc value: 1174.786 
Adaptive bandwidth (number of nearest neighbours): 38 AICc value: 1187.643 
Adaptive bandwidth (number of nearest neighbours): 56 AICc value: 1168.848 
Adaptive bandwidth (number of nearest neighbours): 59 AICc value: 1166.084 
Adaptive bandwidth (number of nearest neighbours): 63 AICc value: 1163.372 
Adaptive bandwidth (number of nearest neighbours): 63 AICc value: 1163.372 
gwr.adaptive_project <- gwr.basic(formula = total_project_count ~ 
                                    `Entry Costs` + `Land Access` + Transparency +
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_project,
                                  kernel = 'boxcar',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:53.922189 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.adaptive_project, kernel = "boxcar", 
    adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: boxcar 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                Min. 1st Qu.  Median 3rd Qu.     Max.
   Intercept                  214.29  214.29  971.88  971.88 1420.200
   `Entry Costs`             -575.15 -575.15 -561.30 -561.30 -463.273
   `Land Access`              221.39  221.39  289.30  353.31  353.307
   Transparency              -433.66 -353.89 -353.89 -333.83 -333.826
   `Time Costs`               359.70  414.10  414.10  468.73  468.725
   `Informal charges`        -487.83 -456.36 -456.36 -356.18 -356.180
   Proactivity               -187.82 -187.82 -140.26 -139.39  -48.202
   `Business Support Policy`  603.83  634.57  634.57  677.86  677.865
   `Labor Policy`             809.11  842.05  864.88  864.88  937.711
   `Law & Order`             -826.69 -737.49 -729.78 -637.65 -637.654
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 10.32786 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 55.67214 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1163.372 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1145.841 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1112.783 
   Residual sum of squares: 114549287 
   R-square value:  0.3841049 
   Adjusted R-square value:  0.267759 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:53.944927 
approach_comparison <- read_xlsx("data/rds/project_local_adaptive_approach_comparison.xlsx")

ggplot(approach_comparison, aes(x = Approach, y = `Adjusted R2`)) +
  geom_bar(stat = "identity", fill = "steelblue", color = "black") +
  labs(title = "Comparison of Approaches",
       x = "Approach",
       y = "Adjusted R2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotates x-axis labels if needed
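As a quick sanity check on the metric being plotted, the Adjusted R² values printed in the GWR diagnostics appear consistent with the usual adjustment applied to the effective degrees of freedom. The sketch below recomputes them from the R-square and effective-degrees-of-freedom figures in the output above; the formula and the helper name `adj_r2_sketch` are assumptions inferred from those printed numbers, not taken from the original analysis.

```r
# Hypothetical helper: adjusted R-squared from R-squared, sample size n and
# effective degrees of freedom (edf), as assumed from the printed diagnostics.
adj_r2_sketch <- function(r2, n, edf) {
  1 - (1 - r2) * (n - 1) / (edf - 1)
}

n <- 66  # number of data points reported in the output

# Inputs copied from the "Diagnostic information" blocks above
fixed_tricube   <- adj_r2_sketch(0.4028715, n, 49.19591)
fixed_boxcar    <- adj_r2_sketch(0.4602377, n, 52.83175)
adaptive_boxcar <- adj_r2_sketch(0.3841049, n, 55.67214)

recomputed <- c(fixed_tricube   = fixed_tricube,
                fixed_boxcar    = fixed_boxcar,
                adaptive_boxcar = adaptive_boxcar)
print(round(recomputed, 4))
```

Because the CV and AIC approaches both selected an adaptive bandwidth of 63 nearest neighbours for the ‘boxcar’ kernel, they yield the same fitted model and therefore the same Adjusted R².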

bw.adaptive_project <- bw.gwr(formula = total_project_count ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="gaussian", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 177627161 
Adaptive bandwidth: 38 CV score: 190695398 
Adaptive bandwidth: 56 CV score: 176034250 
Adaptive bandwidth: 59 CV score: 175254193 
Adaptive bandwidth: 63 CV score: 174198771 
Adaptive bandwidth: 63 CV score: 174198771 
gwr.adaptive_project <- gwr.basic(formula = total_project_count ~ 
                                    `Entry Costs` + `Land Access` + Transparency +
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_project,
                                  kernel = 'gaussian',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:54.162068 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.adaptive_project, kernel = "gaussian", 
    adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: gaussian 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                Min. 1st Qu.  Median 3rd Qu.    Max.
   Intercept                  964.30 1111.39 1137.93 1224.62 1259.56
   `Entry Costs`             -513.95 -504.17 -502.25 -497.73 -495.93
   `Land Access`              193.14  196.99  274.82  303.54  304.67
   Transparency              -381.80 -379.22 -370.59 -342.36 -338.11
   `Time Costs`               407.79  408.81  419.72  420.90  423.19
   `Informal charges`        -498.61 -496.73 -465.44 -399.35 -394.40
   Proactivity               -161.78 -155.87 -128.67 -117.52 -115.66
   `Business Support Policy`  593.48  595.75  627.16  640.67  643.47
   `Labor Policy`             785.46  788.05  830.11  841.68  842.50
   `Law & Order`             -691.39 -690.82 -681.93 -658.21 -656.49
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 12.90588 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 53.09412 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1165.86 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1145.776 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1116.691 
   Residual sum of squares: 112298345 
   R-square value:  0.3962076 
   Adjusted R-square value:  0.2466231 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:54.186151 
bw.adaptive_project <- bw.gwr(formula = total_project_count ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="exponential", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 185941144 
Adaptive bandwidth: 38 CV score: 195096488 
Adaptive bandwidth: 56 CV score: 184492890 
Adaptive bandwidth: 59 CV score: 183754487 
Adaptive bandwidth: 63 CV score: 182692753 
Adaptive bandwidth: 63 CV score: 182692753 
gwr.adaptive_project <- gwr.basic(formula = total_project_count ~ 
                                    `Entry Costs` + `Land Access` + Transparency +
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_project,
                                  kernel = 'exponential',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:54.231847 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.adaptive_project, kernel = "exponential", 
    adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: exponential 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                 Min.  1st Qu.   Median  3rd Qu.     Max.
   Intercept                  729.348 1033.744 1236.558 1327.109 1510.898
   `Entry Costs`             -515.261 -498.998 -494.486 -489.609 -472.187
   `Land Access`              127.604  145.566  259.373  338.924  361.808
   Transparency              -425.159 -398.823 -360.362 -345.599 -320.200
   `Time Costs`               389.744  395.278  424.855  435.555  447.290
   `Informal charges`        -569.914 -542.686 -473.762 -329.692 -314.880
   Proactivity               -216.654 -187.354 -121.510  -85.051  -68.049
   `Business Support Policy`  568.207  585.715  617.608  679.571  708.992
   `Labor Policy`             753.540  777.060  813.820  872.923  905.538
   `Law & Order`             -748.855 -715.832 -699.987 -671.935 -649.054
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 18.04487 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 47.95513 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1171.172 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1144.248 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1124.613 
   Residual sum of squares: 104911409 
   R-square value:  0.4359248 
   Adjusted R-square value:  0.2191505 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:54.254945 
bw.adaptive_project <- bw.gwr(formula = total_project_count ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="bisquare", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 248284089 
Adaptive bandwidth: 38 CV score: 271914275 
Adaptive bandwidth: 56 CV score: 241109629 
Adaptive bandwidth: 59 CV score: 235634296 
Adaptive bandwidth: 63 CV score: 226702587 
Adaptive bandwidth: 63 CV score: 226702587 
gwr.adaptive_project <- gwr.basic(formula = total_project_count ~ 
                                    `Entry Costs` + `Land Access` + Transparency +
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_project,
                                  kernel = 'bisquare',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:54.300376 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.adaptive_project, kernel = "bisquare", 
    adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: bisquare 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.     Max.
   Intercept                 -1081.237  -203.587   197.521  1394.648 1627.165
   `Entry Costs`              -513.612  -411.986  -387.533  -364.058 -348.297
   `Land Access`              -156.219   -93.039   402.078   437.817  461.526
   Transparency               -454.085  -438.506  -363.930  -313.904 -269.018
   `Time Costs`                328.289   357.593   437.385   444.883  486.133
   `Informal charges`         -659.146  -627.250  -497.278   -87.940   32.517
   Proactivity                -239.107  -191.996   -86.649    43.004   72.855
   `Business Support Policy`   461.982   471.910   643.541   767.613  789.629
   `Labor Policy`              602.761   651.865   914.307   949.492  954.086
   `Law & Order`              -820.120  -807.011  -763.879  -691.577 -619.059
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 22.038 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 43.962 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1190.125 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1150.32 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1145.06 
   Residual sum of squares: 107429083 
   R-square value:  0.422388 
   Adjusted R-square value:  0.1260934 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:54.324153 
bw.adaptive_project <- bw.gwr(formula = total_project_count ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="tricube", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 247273713 
Adaptive bandwidth: 38 CV score: 274399968 
Adaptive bandwidth: 56 CV score: 241297509 
Adaptive bandwidth: 59 CV score: 236478189 
Adaptive bandwidth: 63 CV score: 227983849 
Adaptive bandwidth: 63 CV score: 227983849 
gwr.adaptive_project <- gwr.basic(formula = total_project_count ~ 
                                    `Entry Costs` + `Land Access` + Transparency +
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_project,
                                  kernel = 'tricube',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:54.371515 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.adaptive_project, kernel = "tricube", 
    adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: tricube 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.     Max.
   Intercept                 -1331.635  -277.087   -38.139  1303.534 1720.335
   `Entry Costs`              -534.984  -396.460  -371.017  -351.297 -338.501
   `Land Access`              -161.716  -115.294   421.328   439.149  468.591
   Transparency               -441.119  -414.386  -379.048  -297.199 -278.999
   `Time Costs`                317.992   355.733   436.066   438.763  487.691
   `Informal charges`         -663.654  -631.320  -503.250   -85.898   48.684
   Proactivity                -218.319  -194.544   -78.274    46.200   78.638
   `Business Support Policy`   432.985   447.396   654.825   753.477  771.723
   `Labor Policy`              591.925   635.435   910.986   926.960  961.392
   `Law & Order`              -817.849  -782.450  -766.167  -658.159 -617.871
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 20.6876 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 45.3124 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1189.23 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1151.264 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1144.165 
   Residual sum of squares: 109931637 
   R-square value:  0.4089326 
   Adjusted R-square value:  0.1329879 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:54.395251 
bw.adaptive_project <- bw.gwr(formula = total_project_count ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="boxcar", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 204424128 
Adaptive bandwidth: 38 CV score: 235355686 
Adaptive bandwidth: 56 CV score: 181731389 
Adaptive bandwidth: 59 CV score: 174435740 
Adaptive bandwidth: 63 CV score: 168499922 
Adaptive bandwidth: 63 CV score: 168499922 
gwr.adaptive_project <- gwr.basic(formula = total_project_count ~ 
                                    `Entry Costs` + `Land Access` + Transparency +
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_project,
                                  kernel = 'boxcar',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_project
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:54.441262 
   Call:
   gwr.basic(formula = total_project_count ~ `Entry Costs` + `Land Access` + 
    Transparency + `Time Costs` + `Informal charges` + Proactivity + 
    `Business Support Policy` + `Labor Policy` + `Law & Order`, 
    data = pci_2021, bw = bw.adaptive_project, kernel = "boxcar", 
    adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_project_count
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
    Min      1Q  Median      3Q     Max 
-1500.4  -720.6  -243.1   295.9  7609.3 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
   (Intercept)                 1303.9     4351.0   0.300  0.76553   
   `Entry Costs`               -499.2      361.4  -1.381  0.17264   
   `Land Access`                250.5      453.1   0.553  0.58261   
   Transparency                -360.6      327.3  -1.102  0.27531   
   `Time Costs`                 411.7      340.4   1.210  0.23152   
   `Informal charges`          -454.6      388.0  -1.172  0.24628   
   Proactivity                 -142.3      394.6  -0.361  0.71970   
   `Business Support Policy`    611.7      243.7   2.510  0.01499 * 
   `Labor Policy`               810.5      280.4   2.891  0.00546 **
   `Law & Order`               -668.3      444.6  -1.503  0.13846   

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 1425 on 56 degrees of freedom
   Multiple R-squared: 0.3885
   Adjusted R-squared: 0.2902 
   F-statistic: 3.953 on 9 and 56 DF,  p-value: 0.0006067 
   ***Extra Diagnostic information
   Residual sum of squares: 113728132
   Sigma(hat): 1333.042
   AIC:  1157.038
   AICc:  1161.927
   BIC:  1161.21
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: boxcar 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                Min. 1st Qu.  Median 3rd Qu.     Max.
   Intercept                  214.29  214.29  971.88  971.88 1420.200
   `Entry Costs`             -575.15 -575.15 -561.30 -561.30 -463.273
   `Land Access`              221.39  221.39  289.30  353.31  353.307
   Transparency              -433.66 -353.89 -353.89 -333.83 -333.826
   `Time Costs`               359.70  414.10  414.10  468.73  468.725
   `Informal charges`        -487.83 -456.36 -456.36 -356.18 -356.180
   Proactivity               -187.82 -187.82 -140.26 -139.39  -48.202
   `Business Support Policy`  603.83  634.57  634.57  677.86  677.865
   `Labor Policy`             809.11  842.05  864.88  864.88  937.711
   `Law & Order`             -826.69 -737.49 -729.78 -637.65 -637.654
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 10.32786 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 55.67214 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1163.372 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1145.841 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1112.783 
   Residual sum of squares: 114549287 
   R-square value:  0.3841049 
   Adjusted R-square value:  0.267759 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:54.464465 
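The adjusted R-square values in the four adaptive-bandwidth printouts above can be cross-checked by hand. The sketch below assumes the adjustment 1 - (1 - R2)(n - 1)/(edf - 1), where edf is the "Effective degrees of freedom" line of each printout. This formula is our assumption: it reproduces the printed values, but the output does not state it. The names adaptive_runs and adj_r2 are illustrative only.

```r
# Values copied from the four adaptive-bandwidth (approach = "CV") printouts above.
adaptive_runs <- data.frame(
  kernel = c("exponential", "bisquare", "tricube", "boxcar"),
  r2     = c(0.4359248, 0.422388, 0.4089326, 0.3841049),
  edf    = c(47.95513, 43.962, 45.3124, 55.67214)
)
n <- 66  # number of data points reported in every printout

# Assumed adjustment: penalise R2 by the effective degrees of freedom.
adj_r2 <- function(r2, n, edf) 1 - (1 - r2) * (n - 1) / (edf - 1)

adaptive_runs$adj_r2 <- adj_r2(adaptive_runs$r2, n, adaptive_runs$edf)

# Rank the kernels from highest to lowest adjusted R2.
adaptive_runs[order(-adaptive_runs$adj_r2), ]
```

On these figures the boxcar run has the highest adjusted R-square of the four adaptive kernels (about 0.268), largely because it keeps the most effective degrees of freedom (55.67 of 66). It is still below the global regression's 0.2902 reported in the same printouts.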
kernel_comparison <- read_xlsx("data/rds/project_local_adaptive_kernel_comparison.xlsx")

ggplot(kernel_comparison, aes(x = reorder(Kernel, `Adjusted R2`), y = `Adjusted R2`)) +
  geom_col(fill = "steelblue", color = "black") +  # geom_col() is equivalent to geom_bar(stat = "identity")
  labs(title = "Comparison of Kernel Methods",
       x = "Kernel",
       y = "Adjusted R2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotates x-axis labels if needed

Note

We will take the best-performing configuration for each bandwidth type (Fixed and Adaptive) and compare the two.

bandwidth_comparison <- read_xlsx("data/rds/project_local_bandwidth_comparison.xlsx")

ggplot(bandwidth_comparison, aes(x = Bandwidth, y = `Adjusted R2`)) +
  geom_col(fill = "steelblue", color = "black") +  # geom_col() is equivalent to geom_bar(stat = "identity")
  labs(title = "Comparison of Bandwidth Types",
       x = "Bandwidth",
       y = "Adjusted R2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # Rotates x-axis labels if needed

Note

The comparison shows that the best-performing Fixed-bandwidth configuration outperforms the best Adaptive-bandwidth configuration. Among the approaches and kernels tested, the ‘CV’ approach combined with the ‘boxcar’ kernel yielded the highest Adjusted R². Fixed bandwidth with the ‘CV’ approach and the ‘boxcar’ kernel is therefore the most effective configuration for this analysis.

Note

After evaluating the approaches and kernel types, we found that the choice of approach made no difference to performance.

The choice of kernel did matter: the ‘boxcar’ kernel consistently outperformed the others, achieving the highest Adjusted R². For this dataset and modeling context, the ‘boxcar’ kernel therefore offers the best fit.
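To see why the kernels can behave so differently at the same bandwidth, it helps to look at the weights they assign. The sketch below writes out the weighting functions in base R, following the kernel definitions as we understand them from the GWmodel documentation. The function name kernel_weight, the distances, and the bandwidth are hypothetical and not taken from the analysis.

```r
# d: distances from a regression point; bw: bandwidth in the same units as d.
kernel_weight <- function(d, bw,
                          kernel = c("gaussian", "exponential",
                                     "bisquare", "tricube", "boxcar")) {
  kernel <- match.arg(kernel)
  inside <- d < bw  # the compact kernels give zero weight at or beyond the bandwidth
  switch(kernel,
    gaussian    = exp(-0.5 * (d / bw)^2),
    exponential = exp(-d / bw),
    bisquare    = ifelse(inside, (1 - (d / bw)^2)^2, 0),
    tricube     = ifelse(inside, (1 - (d / bw)^3)^3, 0),
    boxcar      = ifelse(inside, 1, 0)
  )
}

d  <- c(0, 25, 50, 75, 100)  # hypothetical distances
bw <- 100                    # hypothetical bandwidth

# One column of weights per kernel, one row per distance.
weights <- sapply(c("exponential", "bisquare", "tricube", "boxcar"),
                  function(k) kernel_weight(d, bw, k))
rownames(weights) <- d
round(weights, 3)
```

The boxcar kernel weights every observation inside the bandwidth equally. With an adaptive bandwidth of 63 neighbours out of 66 provinces, each local fit then uses nearly the whole dataset. This is a plausible reason why the boxcar run's effective number of parameters (10.33) stays close to the 10 coefficients of the global model.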

bw.fixed_capital <- bw.gwr(formula = total_registered_capital ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="CV", 
                           kernel="bisquare", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 CV score: 7783319735 
Fixed bandwidth: 598861.3 CV score: 8456546162 
Fixed bandwidth: 1197409 CV score: 6907612208 
Fixed bandwidth: 1338707 CV score: 6394763233 
Fixed bandwidth: 1426034 CV score: 6166490448 
Fixed bandwidth: 1480005 CV score: 6058968306 
Fixed bandwidth: 1513361 CV score: 6003759868 
Fixed bandwidth: 1533976 CV score: 5973544648 
Fixed bandwidth: 1546717 CV score: 5956273318 
Fixed bandwidth: 1554591 CV score: 5946105624 
Fixed bandwidth: 1559458 CV score: 5940013121 
Fixed bandwidth: 1562465 CV score: 5936321848 
Fixed bandwidth: 1564324 CV score: 5934068480 
Fixed bandwidth: 1565473 CV score: 5932686416 
Fixed bandwidth: 1566183 CV score: 5931836280 
Fixed bandwidth: 1566622 CV score: 5931312400 
Fixed bandwidth: 1566893 CV score: 5930989208 
Fixed bandwidth: 1567061 CV score: 5930789688 
Fixed bandwidth: 1567164 CV score: 5930666463 
Fixed bandwidth: 1567228 CV score: 5930590338 
Fixed bandwidth: 1567268 CV score: 5930543303 
Fixed bandwidth: 1567292 CV score: 5930514238 
Fixed bandwidth: 1567308 CV score: 5930496277 
Fixed bandwidth: 1567317 CV score: 5930485177 
Fixed bandwidth: 1567323 CV score: 5930478317 
Fixed bandwidth: 1567326 CV score: 5930474077 
Fixed bandwidth: 1567328 CV score: 5930471457 
Fixed bandwidth: 1567330 CV score: 5930469838 
Fixed bandwidth: 1567331 CV score: 5930468837 
Fixed bandwidth: 1567331 CV score: 5930468219 
Fixed bandwidth: 1567331 CV score: 5930467836 
Fixed bandwidth: 1567332 CV score: 5930467600 
Fixed bandwidth: 1567332 CV score: 5930467454 
Fixed bandwidth: 1567332 CV score: 5930467364 
Fixed bandwidth: 1567332 CV score: 5930467308 
Fixed bandwidth: 1567332 CV score: 5930467274 
Fixed bandwidth: 1567332 CV score: 5930467252 
Fixed bandwidth: 1567332 CV score: 5930467239 
Fixed bandwidth: 1567332 CV score: 5930467231 
Fixed bandwidth: 1567332 CV score: 5930467226 
Fixed bandwidth: 1567332 CV score: 5930467223 
Fixed bandwidth: 1567332 CV score: 5930467221 
Fixed bandwidth: 1567332 CV score: 5930467220 
Fixed bandwidth: 1567332 CV score: 5930467219 
Fixed bandwidth: 1567332 CV score: 5930467219 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
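The sequence of candidate bandwidths in the log above (first 968784.2, then 598861.3, then steadily narrowing steps) is consistent with a golden-section search over the CV score. That the selector works this way is our reading of the log, not something the output states. The sketch below is a generic golden-section minimiser in base R applied to made-up score functions. The names golden_section_min and toy_score are hypothetical, and no GWmodel internals are used.

```r
# Minimise a one-dimensional function f on [lower, upper] by golden-section search.
golden_section_min <- function(f, lower, upper, tol = 1e-6) {
  ratio <- (sqrt(5) - 1) / 2                 # about 0.618
  x1 <- upper - ratio * (upper - lower)      # left interior point
  x2 <- lower + ratio * (upper - lower)      # right interior point
  f1 <- f(x1)
  f2 <- f(x2)
  while (abs(upper - lower) > tol) {
    if (f1 < f2) {                           # minimum lies in [lower, x2]
      upper <- x2
      x2 <- x1
      f2 <- f1
      x1 <- upper - ratio * (upper - lower)
      f1 <- f(x1)
    } else {                                 # minimum lies in [x1, upper]
      lower <- x1
      x1 <- x2
      f1 <- f2
      x2 <- lower + ratio * (upper - lower)
      f2 <- f(x2)
    }
  }
  (lower + upper) / 2
}

# Hypothetical score with an interior minimum at bw = 3.
toy_score <- function(bw) (bw - 3)^2 + 1
best_bw <- golden_section_min(toy_score, lower = 0, upper = 10)

# Hypothetical score that keeps falling as bw grows: the search ends at the upper bound.
edge_bw <- golden_section_min(function(bw) 1 / bw, lower = 1, upper = 10)

c(best_bw = best_bw, edge_bw = edge_bw)
```

The second call mirrors the pattern in the log, where the CV score falls at every step and the search settles at the largest bandwidth it tried. A result at the edge of the search range may indicate that the data favour a near-global fit for this kernel.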
gwr.fixed_capital <- gwr.basic(formula = total_registered_capital ~ 
                                 `Entry Costs` + `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_capital, 
                               kernel = 'bisquare', 
                               longlat = FALSE)

gwr.fixed_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:54.902668 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.fixed_capital, kernel = "bisquare", 
    longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: bisquare 
   Fixed bandwidth: 1567332 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                   Min.    1st Qu.     Median    3rd Qu.
   Intercept                 -39656.036 -37709.358 -36511.735 -35259.621
   `Entry Costs`              -7766.261  -6520.588  -4431.855  -2230.021
   `Land Access`                -43.236    692.711   2153.727   3568.962
   Transparency                -696.951   -242.575    385.228    604.899
   `Time Costs`                3700.055   4149.172   5788.037   7345.897
   `Informal charges`         -3779.846  -3549.856  -3434.954  -3209.836
   Proactivity                -4031.731  -3460.458  -2426.955   -890.564
   `Business Support Policy`   3464.883   4153.988   5602.217   5987.741
   `Labor Policy`              6390.413   6490.595   6778.570   7458.271
   `Law & Order`              -4128.646  -3918.244  -3182.281  -2150.035
                                  Max.
   Intercept                 -34262.35
   `Entry Costs`              -1482.41
   `Land Access`               4334.98
   Transparency                 721.93
   `Time Costs`                8232.04
   `Informal charges`         -2711.26
   Proactivity                 -297.88
   `Business Support Policy`   6071.96
   `Labor Policy`              8111.08
   `Law & Order`              -1610.82
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 17.98284 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 48.01716 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1406.215 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1377.055 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1360.196 
   Residual sum of squares: 3523822174 
   R-square value:  0.6522746 
   Adjusted R-square value:  0.5192787 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:54.926684 
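The printout above reports diagnostics for both the global regression and the fixed-bandwidth GWR, so the two can be compared directly. The sketch below copies those figures into a small table and computes the differences. The object names are illustrative, and which diagnostic should decide between the models is left open.

```r
# Figures copied from the fixed-bandwidth bisquare (approach = "CV") printout above.
fit_summary <- data.frame(
  model  = c("global lm", "GWR fixed bisquare"),
  aicc   = c(1401.006, 1406.215),
  adj_r2 = c(0.5125, 0.5192787),
  rss    = c(4256608383, 3523822174)
)

delta_aicc    <- diff(fit_summary$aicc)     # GWR minus global; positive means GWR has the higher AICc
delta_adj_r2  <- diff(fit_summary$adj_r2)   # GWR minus global
rss_reduction <- 1 - fit_summary$rss[2] / fit_summary$rss[1]

round(c(delta_aicc = delta_aicc,
        delta_adj_r2 = delta_adj_r2,
        rss_reduction = rss_reduction), 4)
```

On these numbers the GWR lowers the residual sum of squares by roughly 17% and raises the adjusted R-square slightly, while its AICc is higher than that of the global model. The two criteria therefore do not point the same way for registered capital.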
bw.fixed_capital <- bw.gwr(formula = total_registered_capital ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="AIC", 
                           kernel="bisquare", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 AICc value: 1422.128 
Fixed bandwidth: 598861.3 AICc value: 1442.603 
Fixed bandwidth: 1197409 AICc value: 1416.095 
Fixed bandwidth: 1338707 AICc value: 1411.844 
Fixed bandwidth: 1426034 AICc value: 1409.374 
Fixed bandwidth: 1480005 AICc value: 1408.026 
Fixed bandwidth: 1513361 AICc value: 1407.278 
Fixed bandwidth: 1533976 AICc value: 1406.851 
Fixed bandwidth: 1546717 AICc value: 1406.6 
Fixed bandwidth: 1554591 AICc value: 1406.449 
Fixed bandwidth: 1559458 AICc value: 1406.359 
Fixed bandwidth: 1562465 AICc value: 1406.303 
Fixed bandwidth: 1564324 AICc value: 1406.269 
Fixed bandwidth: 1565473 AICc value: 1406.249 
Fixed bandwidth: 1566183 AICc value: 1406.236 
Fixed bandwidth: 1566622 AICc value: 1406.228 
Fixed bandwidth: 1566893 AICc value: 1406.223 
Fixed bandwidth: 1567061 AICc value: 1406.22 
Fixed bandwidth: 1567164 AICc value: 1406.218 
Fixed bandwidth: 1567228 AICc value: 1406.217 
Fixed bandwidth: 1567268 AICc value: 1406.216 
Fixed bandwidth: 1567292 AICc value: 1406.216 
Fixed bandwidth: 1567308 AICc value: 1406.215 
Fixed bandwidth: 1567317 AICc value: 1406.215 
Fixed bandwidth: 1567323 AICc value: 1406.215 
Fixed bandwidth: 1567326 AICc value: 1406.215 
# Fit the GWR with the AICc-selected fixed bandwidth (bisquare kernel)
gwr.fixed_capital <- gwr.basic(formula = total_registered_capital ~
                                 `Entry Costs` + `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_capital, 
                               kernel = 'bisquare', 
                               longlat = FALSE)

gwr.fixed_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:54.991242 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.fixed_capital, kernel = "bisquare", 
    longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: bisquare 
   Fixed bandwidth: 1567326 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                   Min.    1st Qu.     Median    3rd Qu.
   Intercept                 -39656.095 -37709.385 -36511.795 -35259.664
   `Entry Costs`              -7766.282  -6520.609  -4431.856  -2230.005
   `Land Access`                -43.257    692.696   2153.727   3568.976
   Transparency                -696.960   -242.583    385.227    604.900
   `Time Costs`                3700.047   4149.164   5788.041   7345.914
   `Informal charges`         -3779.846  -3549.855  -3434.950  -3209.829
   Proactivity                -4031.743  -3460.464  -2426.953   -890.550
   `Business Support Policy`   3464.867   4153.972   5602.215   5987.738
   `Labor Policy`              6390.411   6490.595   6778.568   7458.278
   `Law & Order`              -4128.650  -3918.252  -3182.283  -2150.024
                                  Max.
   Intercept                 -34262.35
   `Entry Costs`              -1482.40
   `Land Access`               4334.99
   Transparency                 721.93
   `Time Costs`                8232.06
   `Informal charges`         -2711.25
   Proactivity                 -297.87
   `Business Support Policy`   6071.96
   `Labor Policy`              8111.10
   `Law & Order`              -1610.80
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 17.98289 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 48.01711 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1406.215 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1377.055 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1360.196 
   Residual sum of squares: 3523818433 
   R-square value:  0.652275 
   Adjusted R-square value:  0.5192787 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:55.014587 
# Load the pre-compiled comparison table; read_xlsx() comes from the readxl package
approach_comparison <- readxl::read_xlsx("data/rds/capital_local_fixed_approach_comparison.xlsx")

# Bar chart of adjusted R-squared for each approach
ggplot(approach_comparison, aes(x = Approach, y = `Adjusted R2`)) +
  geom_col(fill = "steelblue", color = "black") +
  labs(title = "Comparison of Approaches",
       x = "Approach",
       y = "Adjusted R2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Fixed-bandwidth search: minimise the cross-validation (CV) score with a Gaussian kernel
bw.fixed_capital <- bw.gwr(formula = total_registered_capital ~
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="CV", 
                           kernel="gaussian", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 CV score: 5737865166 
Fixed bandwidth: 598861.3 CV score: 6087138335 
Fixed bandwidth: 1197409 CV score: 5730475390 
Fixed bandwidth: 1338707 CV score: 5734993763 
Fixed bandwidth: 1110082 CV score: 5729795578 
Fixed bandwidth: 1056111 CV score: 5731036307 
Fixed bandwidth: 1143438 CV score: 5729747753 
Fixed bandwidth: 1164053 CV score: 5729924548 
Fixed bandwidth: 1130697 CV score: 5729712983 
Fixed bandwidth: 1122823 CV score: 5729723175 
Fixed bandwidth: 1135564 CV score: 5729719033 
Fixed bandwidth: 1127690 CV score: 5729713901 
Fixed bandwidth: 1132556 CV score: 5729714208 
Fixed bandwidth: 1129548 CV score: 5729712907 
Fixed bandwidth: 1128838 CV score: 5729713123 
Fixed bandwidth: 1129987 CV score: 5729712874 
Fixed bandwidth: 1130258 CV score: 5729712892 
Fixed bandwidth: 1129820 CV score: 5729712878 
Fixed bandwidth: 1130091 CV score: 5729712878 
Fixed bandwidth: 1129923 CV score: 5729712874 
Fixed bandwidth: 1130027 CV score: 5729712875 
Fixed bandwidth: 1129963 CV score: 5729712874 
Fixed bandwidth: 1129948 CV score: 5729712874 
Fixed bandwidth: 1129972 CV score: 5729712874 
Fixed bandwidth: 1129957 CV score: 5729712874 
Fixed bandwidth: 1129953 CV score: 5729712874 
Fixed bandwidth: 1129959 CV score: 5729712874 
Fixed bandwidth: 1129956 CV score: 5729712874 
Fixed bandwidth: 1129955 CV score: 5729712874 
Fixed bandwidth: 1129956 CV score: 5729712874 
Fixed bandwidth: 1129956 CV score: 5729712874 
Fixed bandwidth: 1129956 CV score: 5729712874 
Fixed bandwidth: 1129956 CV score: 5729712874 
Fixed bandwidth: 1129956 CV score: 5729712874 
Fixed bandwidth: 1129956 CV score: 5729712874 
Fixed bandwidth: 1129956 CV score: 5729712874 
Fixed bandwidth: 1129956 CV score: 5729712874 
# Fit the GWR with the CV-selected fixed bandwidth (Gaussian kernel)
gwr.fixed_capital <- gwr.basic(formula = total_registered_capital ~
                                 `Entry Costs` + `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_capital, 
                               kernel = 'gaussian', 
                               longlat = FALSE)

gwr.fixed_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:55.251039 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.fixed_capital, kernel = "gaussian", 
    longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: gaussian 
   Fixed bandwidth: 1129956 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.      Max.
   Intercept                 -36442.88 -35813.53 -34406.09 -33671.47 -33399.24
   `Entry Costs`              -5373.30  -5048.72  -4346.66  -3509.62  -3259.64
   `Land Access`               1526.74   1687.72   2152.75   2606.14   2794.52
   Transparency                 161.09    248.29    415.28    524.61    549.85
   `Time Costs`                4662.12   4865.49   5559.33   6088.97   6332.03
   `Informal charges`         -3821.28  -3793.66  -3769.98  -3689.69  -3610.82
   Proactivity                -3207.09  -3014.99  -2543.55  -1992.60  -1819.85
   `Business Support Policy`   5128.31   5275.31   5681.88   5945.68   6031.39
   `Labor Policy`              6720.15   6751.88   6904.11   7117.54   7237.44
   `Law & Order`              -3394.17  -3307.61  -3062.09  -2751.10  -2643.56
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 13.53215 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 52.46785 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1401.54 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1380.601 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1352.787 
   Residual sum of squares: 3916958197 
   R-square value:  0.6134806 
   Adjusted R-square value:  0.5118552 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:55.274267 
# Fixed-bandwidth search: minimise the cross-validation (CV) score with an exponential kernel
bw.fixed_capital <- bw.gwr(formula = total_registered_capital ~
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="CV", 
                           kernel="exponential", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 CV score: 5749638999 
Fixed bandwidth: 598861.3 CV score: 5952335355 
Fixed bandwidth: 1197409 CV score: 5720032301 
Fixed bandwidth: 1338707 CV score: 5712472421 
Fixed bandwidth: 1426034 CV score: 5709966462 
Fixed bandwidth: 1480005 CV score: 5708991040 
Fixed bandwidth: 1513361 CV score: 5708564088 
Fixed bandwidth: 1533976 CV score: 5708359023 
Fixed bandwidth: 1546717 CV score: 5708252988 
Fixed bandwidth: 1554591 CV score: 5708194974 
Fixed bandwidth: 1559458 CV score: 5708161905 
Fixed bandwidth: 1562465 CV score: 5708142511 
Fixed bandwidth: 1564324 CV score: 5708130918 
Fixed bandwidth: 1565473 CV score: 5708123903 
Fixed bandwidth: 1566183 CV score: 5708119624 
Fixed bandwidth: 1566622 CV score: 5708117002 
Fixed bandwidth: 1566893 CV score: 5708115389 
Fixed bandwidth: 1567061 CV score: 5708114395 
Fixed bandwidth: 1567164 CV score: 5708113783 
Fixed bandwidth: 1567228 CV score: 5708113404 
Fixed bandwidth: 1567268 CV score: 5708113171 
Fixed bandwidth: 1567292 CV score: 5708113026 
Fixed bandwidth: 1567308 CV score: 5708112937 
Fixed bandwidth: 1567317 CV score: 5708112882 
Fixed bandwidth: 1567323 CV score: 5708112848 
Fixed bandwidth: 1567326 CV score: 5708112827 
Fixed bandwidth: 1567328 CV score: 5708112814 
Fixed bandwidth: 1567330 CV score: 5708112806 
Fixed bandwidth: 1567331 CV score: 5708112801 
Fixed bandwidth: 1567331 CV score: 5708112798 
Fixed bandwidth: 1567331 CV score: 5708112796 
Fixed bandwidth: 1567332 CV score: 5708112795 
Fixed bandwidth: 1567332 CV score: 5708112794 
Fixed bandwidth: 1567332 CV score: 5708112794 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
Fixed bandwidth: 1567332 CV score: 5708112793 
# Fit the GWR with the CV-selected fixed bandwidth (exponential kernel)
gwr.fixed_capital <- gwr.basic(formula = total_registered_capital ~
                                 `Entry Costs` + `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_capital, 
                               kernel = 'exponential', 
                               longlat = FALSE)

gwr.fixed_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:55.345223 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.fixed_capital, kernel = "exponential", 
    longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: exponential 
   Fixed bandwidth: 1567332 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.      Max.
   Intercept                 -37674.51 -37002.68 -34408.91 -33076.85 -32079.02
   `Entry Costs`              -5459.98  -5231.93  -4339.69  -3426.24  -3216.01
   `Land Access`               1512.46   1683.56   1983.29   2790.32   2907.19
   Transparency                 150.43    211.77    436.54    518.19    572.03
   `Time Costs`                4542.42   4653.56   5626.28   6219.22   6345.88
   `Informal charges`         -3935.85  -3853.14  -3806.35  -3604.37  -3533.06
   Proactivity                -3439.06  -3322.38  -2491.84  -1894.33  -1845.40
   `Business Support Policy`   5195.16   5314.36   5593.87   6085.26   6141.98
   `Labor Policy`              6657.08   6811.45   6869.03   7283.24   7381.61
   `Law & Order`              -3460.04  -3363.12  -3045.02  -2728.29  -2606.20
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 16.41489 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 49.58511 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1402.409 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1377.872 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1355.116 
   Residual sum of squares: 3669069821 
   R-square value:  0.6379418 
   Adjusted R-square value:  0.5156174 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:55.369303 
# Fixed-bandwidth search: minimise the cross-validation (CV) score with a bisquare kernel
bw.fixed_capital <- bw.gwr(formula = total_registered_capital ~
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="CV", 
                           kernel="bisquare", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 CV score: 7783319735 
Fixed bandwidth: 598861.3 CV score: 8456546162 
Fixed bandwidth: 1197409 CV score: 6907612208 
Fixed bandwidth: 1338707 CV score: 6394763233 
Fixed bandwidth: 1426034 CV score: 6166490448 
Fixed bandwidth: 1480005 CV score: 6058968306 
Fixed bandwidth: 1513361 CV score: 6003759868 
Fixed bandwidth: 1533976 CV score: 5973544648 
Fixed bandwidth: 1546717 CV score: 5956273318 
Fixed bandwidth: 1554591 CV score: 5946105624 
Fixed bandwidth: 1559458 CV score: 5940013121 
Fixed bandwidth: 1562465 CV score: 5936321848 
Fixed bandwidth: 1564324 CV score: 5934068480 
Fixed bandwidth: 1565473 CV score: 5932686416 
Fixed bandwidth: 1566183 CV score: 5931836280 
Fixed bandwidth: 1566622 CV score: 5931312400 
Fixed bandwidth: 1566893 CV score: 5930989208 
Fixed bandwidth: 1567061 CV score: 5930789688 
Fixed bandwidth: 1567164 CV score: 5930666463 
Fixed bandwidth: 1567228 CV score: 5930590338 
Fixed bandwidth: 1567268 CV score: 5930543303 
Fixed bandwidth: 1567292 CV score: 5930514238 
Fixed bandwidth: 1567308 CV score: 5930496277 
Fixed bandwidth: 1567317 CV score: 5930485177 
Fixed bandwidth: 1567323 CV score: 5930478317 
Fixed bandwidth: 1567326 CV score: 5930474077 
Fixed bandwidth: 1567328 CV score: 5930471457 
Fixed bandwidth: 1567330 CV score: 5930469838 
Fixed bandwidth: 1567331 CV score: 5930468837 
Fixed bandwidth: 1567331 CV score: 5930468219 
Fixed bandwidth: 1567331 CV score: 5930467836 
Fixed bandwidth: 1567332 CV score: 5930467600 
Fixed bandwidth: 1567332 CV score: 5930467454 
Fixed bandwidth: 1567332 CV score: 5930467364 
Fixed bandwidth: 1567332 CV score: 5930467308 
Fixed bandwidth: 1567332 CV score: 5930467274 
Fixed bandwidth: 1567332 CV score: 5930467252 
Fixed bandwidth: 1567332 CV score: 5930467239 
Fixed bandwidth: 1567332 CV score: 5930467231 
Fixed bandwidth: 1567332 CV score: 5930467226 
Fixed bandwidth: 1567332 CV score: 5930467223 
Fixed bandwidth: 1567332 CV score: 5930467221 
Fixed bandwidth: 1567332 CV score: 5930467220 
Fixed bandwidth: 1567332 CV score: 5930467219 
Fixed bandwidth: 1567332 CV score: 5930467219 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
Fixed bandwidth: 1567332 CV score: 5930467218 
# Fit the GWR with the CV-selected fixed bandwidth (bisquare kernel)
gwr.fixed_capital <- gwr.basic(formula = total_registered_capital ~
                                 `Entry Costs` + `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_capital, 
                               kernel = 'bisquare', 
                               longlat = FALSE)

gwr.fixed_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:55.437317 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.fixed_capital, kernel = "bisquare", 
    longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: bisquare 
   Fixed bandwidth: 1567332 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                   Min.    1st Qu.     Median    3rd Qu.
   Intercept                 -39656.036 -37709.358 -36511.735 -35259.621
   `Entry Costs`              -7766.261  -6520.588  -4431.855  -2230.021
   `Land Access`                -43.236    692.711   2153.727   3568.962
   Transparency                -696.951   -242.575    385.228    604.899
   `Time Costs`                3700.055   4149.172   5788.037   7345.897
   `Informal charges`         -3779.846  -3549.856  -3434.954  -3209.836
   Proactivity                -4031.731  -3460.458  -2426.955   -890.564
   `Business Support Policy`   3464.883   4153.988   5602.217   5987.741
   `Labor Policy`              6390.413   6490.595   6778.570   7458.271
   `Law & Order`              -4128.646  -3918.244  -3182.281  -2150.035
                                  Max.
   Intercept                 -34262.35
   `Entry Costs`              -1482.41
   `Land Access`               4334.98
   Transparency                 721.93
   `Time Costs`                8232.04
   `Informal charges`         -2711.26
   Proactivity                 -297.88
   `Business Support Policy`   6071.96
   `Labor Policy`              8111.08
   `Law & Order`              -1610.82
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 17.98284 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 48.01716 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1406.215 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1377.055 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1360.196 
   Residual sum of squares: 3523822174 
   R-square value:  0.6522746 
   Adjusted R-square value:  0.5192787 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:55.461035 
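The fixed-bandwidth fits printed so far can be lined up side by side against the global regression. The following is a minimal base-R sketch that uses only the diagnostics reported in the outputs above; the object name `fit_summary` and its column names are illustrative assumptions and are not part of the original workflow.

```r
# Diagnostics copied by hand from the gwr.basic() printouts above
# (fixed bandwidth). `fit_summary` and its column names are
# illustrative, not objects from the original analysis.
fit_summary <- data.frame(
  approach  = c("AIC", "CV", "CV", "CV"),
  kernel    = c("bisquare", "gaussian", "exponential", "bisquare"),
  bandwidth = c(1567326, 1129956, 1567332, 1567332),
  adj_r2    = c(0.5192787, 0.5118552, 0.5156174, 0.5192787),
  aicc      = c(1406.215, 1401.54, 1402.409, 1406.215)
)

# Global (OLS) baseline reported in the same printouts
global_adj_r2 <- 0.5125
global_aicc   <- 1401.006

# Differences relative to the global model
fit_summary$adj_r2_gain <- fit_summary$adj_r2 - global_adj_r2
fit_summary$aicc_diff   <- fit_summary$aicc - global_aicc

# Sort so the highest adjusted R-squared comes first
fit_summary <- fit_summary[order(-fit_summary$adj_r2), ]
print(fit_summary)
```

On these figures the adjusted R-squared values differ by less than 0.01 across the kernels, and each GWR AICc is above the global model's 1401.006.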
# Fixed-bandwidth search: minimise the cross-validation (CV) score with a tricube kernel
bw.fixed_capital <- bw.gwr(formula = total_registered_capital ~
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="CV", 
                           kernel="tricube", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 CV score: 7851312052 
Fixed bandwidth: 598861.3 CV score: 8592301374 
Fixed bandwidth: 1197409 CV score: 7030396284 
Fixed bandwidth: 1338707 CV score: 6470426796 
Fixed bandwidth: 1426034 CV score: 6201002827 
Fixed bandwidth: 1480005 CV score: 6070826472 
Fixed bandwidth: 1513361 CV score: 6003773734 
Fixed bandwidth: 1533976 CV score: 5967115877 
Fixed bandwidth: 1546717 CV score: 5946195145 
Fixed bandwidth: 1554591 CV score: 5933903725 
Fixed bandwidth: 1559458 CV score: 5926544984 
Fixed bandwidth: 1562465 CV score: 5922086531 
Fixed bandwidth: 1564324 CV score: 5919364943 
Fixed bandwidth: 1565473 CV score: 5917695782 
Fixed bandwidth: 1566183 CV score: 5916669083 
Fixed bandwidth: 1566622 CV score: 5916036416 
Fixed bandwidth: 1566893 CV score: 5915646118 
Fixed bandwidth: 1567061 CV score: 5915405173 
Fixed bandwidth: 1567164 CV score: 5915256365 
Fixed bandwidth: 1567228 CV score: 5915164435 
Fixed bandwidth: 1567268 CV score: 5915107635 
Fixed bandwidth: 1567292 CV score: 5915072537 
Fixed bandwidth: 1567308 CV score: 5915050847 
Fixed bandwidth: 1567317 CV score: 5915037442 
Fixed bandwidth: 1567323 CV score: 5915029158 
Fixed bandwidth: 1567326 CV score: 5915024039 
Fixed bandwidth: 1567328 CV score: 5915020875 
Fixed bandwidth: 1567330 CV score: 5915018919 
Fixed bandwidth: 1567331 CV score: 5915017711 
Fixed bandwidth: 1567331 CV score: 5915016964 
Fixed bandwidth: 1567331 CV score: 5915016502 
Fixed bandwidth: 1567332 CV score: 5915016217 
Fixed bandwidth: 1567332 CV score: 5915016040 
Fixed bandwidth: 1567332 CV score: 5915015931 
Fixed bandwidth: 1567332 CV score: 5915015864 
Fixed bandwidth: 1567332 CV score: 5915015822 
Fixed bandwidth: 1567332 CV score: 5915015797 
Fixed bandwidth: 1567332 CV score: 5915015781 
Fixed bandwidth: 1567332 CV score: 5915015771 
Fixed bandwidth: 1567332 CV score: 5915015765 
Fixed bandwidth: 1567332 CV score: 5915015761 
Fixed bandwidth: 1567332 CV score: 5915015759 
Fixed bandwidth: 1567332 CV score: 5915015757 
Fixed bandwidth: 1567332 CV score: 5915015757 
Fixed bandwidth: 1567332 CV score: 5915015756 
Fixed bandwidth: 1567332 CV score: 5915015756 
Fixed bandwidth: 1567332 CV score: 5915015755 
Fixed bandwidth: 1567332 CV score: 5915015755 
Fixed bandwidth: 1567332 CV score: 5915015755 
Fixed bandwidth: 1567332 CV score: 5915015755 
gwr.fixed_capital <- gwr.basic(formula = total_registered_capital ~ 
                                 `Entry Costs` + `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_capital, 
                               kernel = 'tricube', 
                               longlat = FALSE)

gwr.fixed_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:55.535393 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.fixed_capital, kernel = "tricube", 
    longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: tricube 
   Fixed bandwidth: 1567332 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.      Max.
   Intercept                 -39514.12 -36872.96 -36089.57 -35272.06 -34471.79
   `Entry Costs`              -7674.22  -6506.37  -4452.43  -2206.36  -1332.01
   `Land Access`               -266.45    615.59   2271.45   3488.82   4159.34
   Transparency                -642.62   -226.32    355.37    650.63    759.50
   `Time Costs`                3708.79   4259.18   5699.30   7308.60   8243.90
   `Informal charges`         -3781.92  -3612.92  -3461.99  -3143.90  -2786.39
   Proactivity                -3836.21  -3310.28  -2468.52   -895.02   -167.04
   `Business Support Policy`   3197.01   4028.81   5689.53   5935.21   5972.56
   `Labor Policy`              6349.52   6406.45   6905.82   7271.20   7946.32
   `Law & Order`              -4139.93  -3903.33  -3201.87  -2109.54  -1407.47
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 16.80406 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 49.19594 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1405.871 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1378.268 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1359.491 
   Residual sum of squares: 3622048375 
   R-square value:  0.6425818 
   Adjusted R-square value:  0.517964 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:55.559644 
bw.fixed_capital <- bw.gwr(formula = total_registered_capital ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="CV", 
                           kernel="boxcar", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 CV score: 6660001230 
Fixed bandwidth: 598861.3 CV score: 8381045486 
Fixed bandwidth: 1197409 CV score: 5518004098 
Fixed bandwidth: 1338707 CV score: 5643658360 
Fixed bandwidth: 1110082 CV score: 5968840530 
Fixed bandwidth: 1251380 CV score: 5680684732 
Fixed bandwidth: 1164053 CV score: 5557485916 
Fixed bandwidth: 1218024 CV score: 5462660387 
Fixed bandwidth: 1230765 CV score: 5519707752 
Fixed bandwidth: 1210150 CV score: 5429675517 
Fixed bandwidth: 1205283 CV score: 5419111082 
Fixed bandwidth: 1202276 CV score: 5449814745 
Fixed bandwidth: 1207142 CV score: 5380297063 
Fixed bandwidth: 1208291 CV score: 5418884413 
Fixed bandwidth: 1206432 CV score: 5380297063 
gwr.fixed_capital <- gwr.basic(formula = total_registered_capital ~ 
                                 `Entry Costs` + `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_capital, 
                               kernel = 'boxcar', 
                               longlat = FALSE)

gwr.fixed_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:55.608437 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.fixed_capital, kernel = "boxcar", 
    longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: boxcar 
   Fixed bandwidth: 1206432 
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.      Max.
   Intercept                 -49497.63 -40646.84 -35430.14 -34440.38 -26309.16
   `Entry Costs`              -7885.56  -6097.37  -4289.41  -3756.00  -1053.55
   `Land Access`              -1126.13   2171.08   2367.77   3283.84   4319.80
   Transparency                -860.39   -115.71    372.09    583.19   2293.65
   `Time Costs`                3176.36   5400.36   5407.11   6233.13   8412.18
   `Informal charges`         -4942.22  -4000.32  -3822.14  -2740.83  -1997.27
   Proactivity                -3572.12  -2636.05  -2309.27  -1876.36    471.37
   `Business Support Policy`   2007.21   5118.09   5732.16   5807.65   6487.12
   `Labor Policy`              5818.28   6604.45   6999.22   7247.19   7593.54
   `Law & Order`              -4321.55  -3956.86  -3181.03  -2963.59   -424.85
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 12.03779 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 53.96221 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1399.444 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1378.362 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1350.758 
   Residual sum of squares: 3782531904 
   R-square value:  0.6267455 
   Adjusted R-square value:  0.5419085 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:55.631339 
kernel_comparison <- read_xlsx("data/rds/capital_local_fixed_kernel_comparison.xlsx")

ggplot(kernel_comparison, aes(x = reorder(Kernel, `Adjusted R2`), y = `Adjusted R2`)) +
  geom_col(fill = "steelblue", color = "black") +
  labs(title = "Comparison of Kernel Methods",
       x = "Kernel",
       y = "Adjusted R2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
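
The Adjusted R2 values in this comparison can be cross-checked against the GWR printouts. The sketch below, in base R, copies the R-square value and the effective number of parameters (ENP) for the two fixed-bandwidth fits printed above (tricube and boxcar) and recomputes Adjusted R2. It assumes the usual penalty 1 - (1 - R2)(n - 1)/(n - ENP - 1) is what the printout reports. The data frame `fixed_fits` and its column names are illustrative and not part of the original workflow.

```r
# Diagnostics copied from the two fixed-bandwidth GWR printouts above (n = 66).
n <- 66
fixed_fits <- data.frame(
  kernel = c("tricube", "boxcar"),
  r2     = c(0.6425818, 0.6267455),  # "R-square value"
  enp    = c(16.80406, 12.03779),    # "Effective number of parameters"
  aicc   = c(1405.871, 1399.444)     # "AICc"
)

# Assumed formula: penalise R2 by the effective number of parameters.
fixed_fits$adj_r2 <- 1 - (1 - fixed_fits$r2) * (n - 1) / (n - fixed_fits$enp - 1)

# Rank the kernels from highest to lowest Adjusted R2.
fixed_fits[order(-fixed_fits$adj_r2), ]
```

With these inputs the recomputed values agree with the printed Adjusted R-square values (about 0.518 for tricube and 0.542 for boxcar). The tricube fit has the higher raw R-square but uses more effective parameters, so it ranks below boxcar once the penalty is applied.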

Note

After evaluating the bandwidth-selection approaches and kernel types, we found no difference in performance between the approaches.

Kernel choice did matter, however. The ‘boxcar’ kernel achieved the highest Adjusted R² (0.5419 in the fixed-bandwidth output above), which suggests that, for this dataset and modeling context, it offers the most effective fit.
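
To make the kernel comparison more concrete, here is a minimal base R sketch of the five weighting functions as they are commonly defined for GWR. The helper `gw_kernel()` is hypothetical and written only for illustration; it is not a GWmodel function. The distances are arbitrary multiples of the boxcar bandwidth reported above.

```r
# Weight given to an observation at distance d from the regression point,
# for bandwidth bw. Standard GWR kernel definitions (illustrative helper).
gw_kernel <- function(d, bw,
                      kernel = c("gaussian", "exponential", "bisquare",
                                 "tricube", "boxcar")) {
  kernel <- match.arg(kernel)
  u <- d / bw
  switch(kernel,
    gaussian    = exp(-0.5 * u^2),               # never reaches zero
    exponential = exp(-u),                       # never reaches zero
    bisquare    = ifelse(u < 1, (1 - u^2)^2, 0), # zero beyond the bandwidth
    tricube     = ifelse(u < 1, (1 - u^3)^3, 0), # zero beyond the bandwidth
    boxcar      = ifelse(u < 1, 1, 0)            # equal weight inside, zero outside
  )
}

bw <- 1206432                      # fixed boxcar bandwidth from the output above
d  <- c(0, 0.5, 0.9, 1.5) * bw     # arbitrary example distances
kernels <- c("gaussian", "exponential", "bisquare", "tricube", "boxcar")

weights <- sapply(kernels, function(k) gw_kernel(d, bw = bw, kernel = k))
rownames(weights) <- c("0", "0.5 bw", "0.9 bw", "1.5 bw")
round(weights, 3)
```

The boxcar kernel treats every observation inside the bandwidth equally, whereas bisquare and tricube taper towards zero at the bandwidth, and the Gaussian and exponential kernels keep a small weight on distant observations. With an adaptive bandwidth, `bw` would instead be the distance to the k-th nearest neighbour of each regression point.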

bw.adaptive_capital <- bw.gwr(formula = total_registered_capital ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="bisquare", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 7552329470 
Adaptive bandwidth: 38 CV score: 8078226145 
Adaptive bandwidth: 56 CV score: 7245214023 
Adaptive bandwidth: 59 CV score: 6996709773 
Adaptive bandwidth: 63 CV score: 6680191144 
Adaptive bandwidth: 63 CV score: 6680191144 
gwr.adaptive_capital <- gwr.basic(formula = total_registered_capital ~ 
                                    `Entry Costs` + `Land Access` + Transparency + 
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_capital, 
                                  kernel = 'bisquare',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:55.854077 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.adaptive_capital, 
    kernel = "bisquare", adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: bisquare 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.       Max.
   Intercept                 -41074.10 -39734.01 -37621.80 -36852.31 -32049.862
   `Entry Costs`              -7951.74  -7491.97  -4447.79  -1403.80  -1145.634
   `Land Access`               -329.15   -286.28   2032.35   4158.50   4446.563
   Transparency                -750.65   -616.85    507.46    744.27    959.962
   `Time Costs`                3551.50   3800.16   6762.80   8192.55   8344.559
   `Informal charges`         -3997.21  -3449.04  -3218.23  -2800.56  -2663.778
   Proactivity                -4250.08  -3779.54  -2332.51   -180.23      0.763
   `Business Support Policy`   3210.83   3227.38   4896.51   5840.28   5912.261
   `Labor Policy`              5492.11   6446.88   6512.91   7886.25   8236.279
   `Law & Order`              -4258.17  -4181.16  -3330.83  -1394.81  -1358.757
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 22.038 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 43.962 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1413.956 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1374.151 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1368.891 
   Residual sum of squares: 3191366879 
   R-square value:  0.6850808 
   Adjusted R-square value:  0.5235383 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:55.878488 
bw.adaptive_capital <- bw.gwr(formula = total_registered_capital ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="AIC", 
                              kernel="bisquare", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth (number of nearest neighbours): 48 AICc value: 1421.962 
Adaptive bandwidth (number of nearest neighbours): 38 AICc value: 1429.354 
Adaptive bandwidth (number of nearest neighbours): 56 AICc value: 1418.48 
Adaptive bandwidth (number of nearest neighbours): 59 AICc value: 1417.057 
Adaptive bandwidth (number of nearest neighbours): 63 AICc value: 1413.956 
Adaptive bandwidth (number of nearest neighbours): 63 AICc value: 1413.956 
gwr.adaptive_capital <- gwr.basic(formula = total_registered_capital ~ 
                                    `Entry Costs` + `Land Access` + Transparency + 
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_capital, 
                                  kernel = 'bisquare',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:55.927806 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.adaptive_capital, 
    kernel = "bisquare", adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: bisquare 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.       Max.
   Intercept                 -41074.10 -39734.01 -37621.80 -36852.31 -32049.862
   `Entry Costs`              -7951.74  -7491.97  -4447.79  -1403.80  -1145.634
   `Land Access`               -329.15   -286.28   2032.35   4158.50   4446.563
   Transparency                -750.65   -616.85    507.46    744.27    959.962
   `Time Costs`                3551.50   3800.16   6762.80   8192.55   8344.559
   `Informal charges`         -3997.21  -3449.04  -3218.23  -2800.56  -2663.778
   Proactivity                -4250.08  -3779.54  -2332.51   -180.23      0.763
   `Business Support Policy`   3210.83   3227.38   4896.51   5840.28   5912.261
   `Labor Policy`              5492.11   6446.88   6512.91   7886.25   8236.279
   `Law & Order`              -4258.17  -4181.16  -3330.83  -1394.81  -1358.757
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 22.038 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 43.962 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1413.956 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1374.151 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1368.891 
   Residual sum of squares: 3191366879 
   R-square value:  0.6850808 
   Adjusted R-square value:  0.5235383 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:55.952839 
approach_comparison <- read_xlsx("data/rds/capital_local_adaptive_approach_comparison.xlsx")

ggplot(approach_comparison, aes(x = Approach, y = `Adjusted R2`)) +
  geom_col(fill = "steelblue", color = "black") +
  labs(title = "Comparison of Approaches",
       x = "Approach",
       y = "Adjusted R2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
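
The CV and AIC runs above report identical diagnostics because both searches settle on the same adaptive bandwidth. The sketch below copies the two search traces from the `bw.gwr()` output for the adaptive bisquare kernel and checks where each criterion is minimised. The data frame `search_trace` is an illustrative name and not part of the original workflow.

```r
# Search traces copied from the two bw.gwr() runs above (adaptive bisquare kernel).
search_trace <- data.frame(
  neighbours = c(48, 38, 56, 59, 63),
  cv_score   = c(7552329470, 8078226145, 7245214023, 6996709773, 6680191144),
  aicc       = c(1421.962, 1429.354, 1418.48, 1417.057, 1413.956)
)

# Both criteria are "lower is better"; find the bandwidth that minimises each.
best_cv   <- search_trace$neighbours[which.min(search_trace$cv_score)]
best_aicc <- search_trace$neighbours[which.min(search_trace$aicc)]

c(CV = best_cv, AICc = best_aicc)
```

Both criteria pick 63 nearest neighbours, so the two `gwr.basic()` calls fit the same model and return the same Adjusted R-square value (0.5235).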

bw.adaptive_capital <- bw.gwr(formula = total_registered_capital ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="gaussian", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 5781051227 
Adaptive bandwidth: 38 CV score: 5988983915 
Adaptive bandwidth: 56 CV score: 5766642473 
Adaptive bandwidth: 59 CV score: 5757831428 
Adaptive bandwidth: 63 CV score: 5756556233 
Adaptive bandwidth: 63 CV score: 5756556233 
gwr.adaptive_capital <- gwr.basic(formula = total_registered_capital ~ 
                                    `Entry Costs` + `Land Access` + Transparency + 
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_capital, 
                                  kernel = 'gaussian',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:56.194223 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.adaptive_capital, 
    kernel = "gaussian", adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: gaussian 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.      Max.
   Intercept                 -35660.74 -35518.48 -34348.13 -33817.53 -33740.09
   `Entry Costs`              -4890.15  -4846.01  -4367.89  -3683.92  -3671.03
   `Land Access`               1789.03   1808.60   2131.00   2492.05   2541.94
   Transparency                 291.52    302.03    410.81    507.96    515.21
   `Time Costs`                4957.06   4986.02   5648.46   5907.74   5915.26
   `Informal charges`         -3823.47  -3811.13  -3796.70  -3723.32  -3701.75
   Proactivity                -2971.57  -2908.43  -2489.94  -2145.23  -2132.66
   `Business Support Policy`   5388.03   5393.41   5640.50   5897.94   5933.73
   `Labor Policy`              6739.85   6801.12   6844.28   7079.92   7110.14
   `Law & Order`              -3245.81  -3223.24  -3099.11  -2814.38  -2798.71
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 12.90588 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 53.09412 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1401.387 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1381.304 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1352.218 
   Residual sum of squares: 3982880708 
   R-square value:  0.6069754 
   Adjusted R-square value:  0.5096069 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:56.217399 
bw.adaptive_capital <- bw.gwr(formula = total_registered_capital ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="exponential", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 5766546622 
Adaptive bandwidth: 38 CV score: 5897290227 
Adaptive bandwidth: 56 CV score: 5749422410 
Adaptive bandwidth: 59 CV score: 5740660365 
Adaptive bandwidth: 63 CV score: 5734925534 
Adaptive bandwidth: 63 CV score: 5734925534 
gwr.adaptive_capital <- gwr.basic(formula = total_registered_capital ~ 
                                    `Entry Costs` + `Land Access` + Transparency + 
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_capital, 
                                  kernel = 'exponential',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:56.266476 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.adaptive_capital, 
    kernel = "exponential", adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: exponential 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.      Max.
   Intercept                 -38500.97 -37423.22 -34627.96 -32871.50 -30183.06
   `Entry Costs`              -5503.37  -5383.74  -4357.40  -3270.49  -3139.14
   `Land Access`               1475.59   1579.73   1810.70   2883.06   3026.10
   Transparency                 100.85    163.55    442.98    528.46    578.85
   `Time Costs`                4446.05   4523.66   5774.52   6347.06   6529.21
   `Informal charges`         -3940.54  -3842.90  -3797.40  -3546.99  -3471.25
   Proactivity                -3517.35  -3439.56  -2410.27  -1767.44  -1626.42
   `Business Support Policy`   5133.02   5212.05   5466.88   6124.98   6186.44
   `Labor Policy`              6295.05   6783.06   6862.31   7338.52   7434.98
   `Law & Order`              -3544.41  -3419.80  -3057.10  -2665.92  -2584.70
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 18.04487 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 47.95513 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1403.172 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1376.248 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1356.613 
   Residual sum of squares: 3527210782 
   R-square value:  0.6519402 
   Adjusted R-square value:  0.5181808 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:56.291576 
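
The runs in this part differ only in the `kernel` argument, so it helps to see how each kernel turns distance into a weight. The sketch below uses the textbook forms of the five kernels; `kernel_weight()` is a hypothetical helper written for this illustration, not a GWmodel function, and the boundary handling (`<` versus `<=`) is an assumption. With an adaptive bandwidth of 63 nearest neighbours out of 66 provinces, each local fit draws on almost the whole sample, which is consistent with the local coefficients staying close to the global estimates.

```r
# Textbook kernel weight functions (illustrative sketch, not GWmodel's own code).
# d  = distance from the regression point to an observation
# bw = bandwidth; for an adaptive bandwidth this is the distance to the
#      k-th nearest neighbour of the regression point
kernel_weight <- function(d, bw, kernel = c("gaussian", "exponential",
                                            "bisquare", "tricube", "boxcar")) {
  kernel <- match.arg(kernel)
  u <- d / bw
  switch(kernel,
    gaussian    = exp(-0.5 * u^2),               # never reaches zero
    exponential = exp(-u),                       # never reaches zero
    bisquare    = ifelse(u < 1, (1 - u^2)^2, 0), # zero beyond the bandwidth
    tricube     = ifelse(u < 1, (1 - u^3)^3, 0), # zero beyond the bandwidth
    boxcar      = ifelse(u < 1, 1, 0)            # equal weight inside, zero outside
  )
}

# Weights at 0 %, 50 %, 90 % and 100 % of the bandwidth (illustrative distances)
d <- c(0, 0.5, 0.9, 1)
kernels <- c("gaussian", "exponential", "bisquare", "tricube", "boxcar")
weights <- sapply(kernels, function(k) kernel_weight(d, bw = 1, kernel = k))
rownames(weights) <- paste0("d/bw=", d)
print(round(weights, 3))
```

The bisquare and tricube kernels cut off observations beyond the bandwidth, while the gaussian and exponential kernels only down-weight them. This may explain why the bisquare and tricube runs show wider coefficient ranges and larger effective numbers of parameters.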
bw.adaptive_capital <- bw.gwr(formula = total_registered_capital ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="bisquare", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 7552329470 
Adaptive bandwidth: 38 CV score: 8078226145 
Adaptive bandwidth: 56 CV score: 7245214023 
Adaptive bandwidth: 59 CV score: 6996709773 
Adaptive bandwidth: 63 CV score: 6680191144 
Adaptive bandwidth: 63 CV score: 6680191144 
gwr.adaptive_capital <- gwr.basic(formula = total_registered_capital ~ 
                                    `Entry Costs` + `Land Access` + Transparency + 
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_capital, 
                                  kernel = 'bisquare',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:56.338907 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.adaptive_capital, 
    kernel = "bisquare", adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: bisquare 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.       Max.
   Intercept                 -41074.10 -39734.01 -37621.80 -36852.31 -32049.862
   `Entry Costs`              -7951.74  -7491.97  -4447.79  -1403.80  -1145.634
   `Land Access`               -329.15   -286.28   2032.35   4158.50   4446.563
   Transparency                -750.65   -616.85    507.46    744.27    959.962
   `Time Costs`                3551.50   3800.16   6762.80   8192.55   8344.559
   `Informal charges`         -3997.21  -3449.04  -3218.23  -2800.56  -2663.778
   Proactivity                -4250.08  -3779.54  -2332.51   -180.23      0.763
   `Business Support Policy`   3210.83   3227.38   4896.51   5840.28   5912.261
   `Labor Policy`              5492.11   6446.88   6512.91   7886.25   8236.279
   `Law & Order`              -4258.17  -4181.16  -3330.83  -1394.81  -1358.757
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 22.038 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 43.962 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1413.956 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1374.151 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1368.891 
   Residual sum of squares: 3191366879 
   R-square value:  0.6850808 
   Adjusted R-square value:  0.5235383 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:56.36347 
bw.adaptive_capital <- bw.gwr(formula = total_registered_capital ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="tricube", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 7596304334 
Adaptive bandwidth: 38 CV score: 8196863184 
Adaptive bandwidth: 56 CV score: 7375514751 
Adaptive bandwidth: 59 CV score: 7138573616 
Adaptive bandwidth: 63 CV score: 6811273447 
Adaptive bandwidth: 63 CV score: 6811273447 
gwr.adaptive_capital <- gwr.basic(formula = total_registered_capital ~ 
                                    `Entry Costs` + `Land Access` + Transparency + 
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_capital, 
                                  kernel = 'tricube',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:56.411491 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.adaptive_capital, 
    kernel = "tricube", adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: tricube 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                   Min.    1st Qu.     Median    3rd Qu.       Max.
   Intercept                 -40890.810 -38925.387 -36189.304 -34853.822 -33506.221
   `Entry Costs`              -7815.938  -7453.635  -4527.630  -1231.472  -1012.735
   `Land Access`               -545.444   -508.720   2220.166   3926.417   4284.693
   Transparency                -714.231   -527.261    446.917    758.266   1080.165
   `Time Costs`                3553.935   3839.387   7001.378   8133.690   8397.056
   `Informal charges`         -4005.748  -3502.270  -3287.430  -2877.722  -2754.215
   Proactivity                -4042.882  -3731.851  -2304.506    -77.475     93.391
   `Business Support Policy`   2896.952   2960.567   4891.863   5738.085   5762.763
   `Labor Policy`              5653.386   6365.076   6403.458   7621.979   8161.256
   `Law & Order`              -4182.778  -4099.373  -3407.830  -1208.205  -1163.962
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 20.6876 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 45.3124 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1414.214 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1376.248 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1369.15 
   Residual sum of squares: 3323315458 
   R-square value:  0.6720603 
   Adjusted R-square value:  0.518959 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:56.437063 
bw.adaptive_capital <- bw.gwr(formula = total_registered_capital ~ 
                                `Entry Costs` + `Land Access` + Transparency + 
                                `Time Costs` + `Informal charges` + Proactivity +
                                `Business Support Policy` + `Labor Policy` +
                                `Law & Order`,
                              data=pci_2021,
                              approach="CV", 
                              kernel="boxcar", 
                              adaptive=TRUE, 
                              longlat=FALSE)
Adaptive bandwidth: 48 CV score: 6070224350 
Adaptive bandwidth: 38 CV score: 7936804726 
Adaptive bandwidth: 56 CV score: 5896467685 
Adaptive bandwidth: 59 CV score: 5706822467 
Adaptive bandwidth: 63 CV score: 5619504965 
Adaptive bandwidth: 63 CV score: 5619504965 
gwr.adaptive_capital <- gwr.basic(formula = total_registered_capital ~ 
                                    `Entry Costs` + `Land Access` + Transparency + 
                                    `Time Costs` + `Informal charges` + Proactivity + 
                                    `Business Support Policy` + `Labor Policy` +
                                    `Law & Order`,
                                  data=pci_2021,
                                  bw=bw.adaptive_capital, 
                                  kernel = 'boxcar',
                                  adaptive=TRUE, 
                                  longlat = FALSE)

gwr.adaptive_capital
   ***********************************************************************
   *                       Package   GWmodel                             *
   ***********************************************************************
   Program starts at: 2024-11-12 17:10:56.482853 
   Call:
   gwr.basic(formula = total_registered_capital ~ `Entry Costs` + 
    `Land Access` + Transparency + `Time Costs` + `Informal charges` + 
    Proactivity + `Business Support Policy` + `Labor Policy` + 
    `Law & Order`, data = pci_2021, bw = bw.adaptive_capital, 
    kernel = "boxcar", adaptive = TRUE, longlat = FALSE)

   Dependent (y) variable:  total_registered_capital
   Independent variables:  Entry Costs Land Access Transparency Time Costs Informal charges Proactivity Business Support Policy Labor Policy Law & Order
   Number of data points: 66
   ***********************************************************************
   *                    Results of Global Regression                     *
   ***********************************************************************

   Call:
    lm(formula = formula, data = data)

   Residuals:
   Min     1Q Median     3Q    Max 
-14578  -6219  -1215   5880  22546 

   Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
   (Intercept)                 -34440      26619  -1.294 0.201028    
   `Entry Costs`                -4289       2211  -1.940 0.057402 .  
   `Land Access`                 2175       2772   0.785 0.435874    
   Transparency                   428       2002   0.214 0.831530    
   `Time Costs`                  5400       2082   2.593 0.012103 *  
   `Informal charges`           -3822       2374  -1.610 0.112990    
   Proactivity                  -2636       2414  -1.092 0.279523    
   `Business Support Policy`     5736       1491   3.847 0.000309 ***
   `Labor Policy`                6999       1715   4.081 0.000144 ***
   `Law & Order`                -2987       2720  -1.098 0.276892    

   ---Significance stars
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
   Residual standard error: 8718 on 56 degrees of freedom
   Multiple R-squared:  0.58
   Adjusted R-squared: 0.5125 
   F-statistic: 8.591 on 9 and 56 DF,  p-value: 6.006e-08 
   ***Extra Diagnostic information
   Residual sum of squares: 4256608383
   Sigma(hat): 8155.336
   AIC:  1396.117
   AICc:  1401.006
   BIC:  1400.29
   ***********************************************************************
   *          Results of Geographically Weighted Regression              *
   ***********************************************************************

   *********************Model calibration information*********************
   Kernel function: boxcar 
   Adaptive bandwidth: 63 (number of nearest neighbours)
   Regression points: the same locations as observations are used.
   Distance metric: Euclidean distance metric is used.

   ****************Summary of GWR coefficient estimates:******************
                                  Min.   1st Qu.    Median   3rd Qu.      Max.
   Intercept                 -45108.20 -45108.20 -34156.68 -32649.02 -32649.02
   `Entry Costs`              -4766.07  -4766.07  -4192.18  -3639.63  -3639.63
   `Land Access`               2242.83   2466.78   2466.78   3138.82   3138.82
   Transparency                 204.06    300.10    300.10    497.16    497.16
   `Time Costs`                5119.62   5366.96   5366.96   5507.53   5714.70
   `Informal charges`         -3980.65  -3911.89  -3906.83  -2938.53  -2938.53
   Proactivity                -2665.85  -2582.49  -2316.37  -2272.94  -2272.94
   `Business Support Policy`   5118.09   5118.09   5769.08   5991.51   5991.51
   `Labor Policy`              6604.06   6604.06   7226.85   7293.70   7590.85
   `Law & Order`              -3662.37  -3605.30  -3398.09  -3388.80  -3134.99
   ************************Diagnostic information*************************
   Number of data points: 66 
   Effective number of parameters (2trace(S) - trace(S'S)): 10.32786 
   Effective degrees of freedom (n-2trace(S) + trace(S'S)): 55.67214 
   AICc (GWR book, Fotheringham, et al. 2002, p. 61, eq 2.33): 1399.02 
   AIC (GWR book, Fotheringham, et al. 2002,GWR p. 96, eq. 4.22): 1381.489 
   BIC (GWR book, Fotheringham, et al. 2002,GWR p. 61, eq. 2.34): 1348.431 
   Residual sum of squares: 4070123556 
   R-square value:  0.5983664 
   Adjusted R-square value:  0.5224957 

   ***********************************************************************
   Program stops at: 2024-11-12 17:10:56.505639 
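
The adjusted R-square values printed above can be reproduced from each run's R-square and effective number of parameters (ENP). The sketch below assumes the relationship adj R2 = 1 - (1 - R2)(n - 1)/(n - ENP - 1); this formula is inferred from the printed numbers rather than quoted from the package documentation. All inputs are copied from the five adaptive-bandwidth outputs.

```r
n <- 66  # number of data points reported in every run

# R-square, ENP and printed adjusted R-square, copied from the outputs above
runs <- data.frame(
  kernel         = c("gaussian", "exponential", "bisquare", "tricube", "boxcar"),
  r2             = c(0.6069754, 0.6519402, 0.6850808, 0.6720603, 0.5983664),
  enp            = c(12.90588, 18.04487, 22.038, 20.6876, 10.32786),
  adj_r2_printed = c(0.5096069, 0.5181808, 0.5235383, 0.518959, 0.5224957)
)

# Assumed formula: penalise R-square by the effective number of parameters
runs$adj_r2 <- 1 - (1 - runs$r2) * (n - 1) / (n - runs$enp - 1)

# Difference between the recomputed and the printed values
runs$diff <- runs$adj_r2 - runs$adj_r2_printed
print(runs)
```

The bisquare kernel has the highest raw R-square (0.685) but also uses about 22 effective parameters, so its adjusted R-square (0.5235) ends up only marginally above the boxcar kernel's (0.5225), which uses about 10.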
# Adjusted R2 for each kernel under the adaptive bandwidth
kernel_comparison <- read_xlsx("data/rds/capital_local_adaptive_kernel_comparison.xlsx")

# Order the bars from lowest to highest Adjusted R2
ggplot(kernel_comparison, aes(x = reorder(Kernel, `Adjusted R2`), y = `Adjusted R2`)) +
  geom_col(fill = "steelblue", color = "black") +
  labs(title = "Comparison of Kernel Methods",
       x = "Kernel",
       y = "Adjusted R2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
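
As a cross-check on the chart, the comparison can also be assembled directly from the diagnostics printed in the five runs above and set against the global regression. This is a sketch: the contents of the `.xlsx` file are not shown in this document, so the table below is rebuilt from the console output, and `kernel_summary` is a name introduced here for illustration.

```r
# Diagnostics of the global regression, as printed in every run
global_aicc  <- 1401.006
global_adjr2 <- 0.5125

# Adjusted R2 and AICc of each adaptive-bandwidth GWR run, copied from the outputs
kernel_summary <- data.frame(
  Kernel        = c("gaussian", "exponential", "bisquare", "tricube", "boxcar"),
  `Adjusted R2` = c(0.5096069, 0.5181808, 0.5235383, 0.518959, 0.5224957),
  AICc          = c(1401.387, 1403.172, 1413.956, 1414.214, 1399.02),
  check.names   = FALSE
)

# Change relative to the global model (negative AICc change = better fit)
kernel_summary$delta_aicc  <- kernel_summary$AICc - global_aicc
kernel_summary$delta_adjr2 <- kernel_summary$`Adjusted R2` - global_adjr2

# Rank from lowest to highest AICc
print(kernel_summary[order(kernel_summary$AICc), ])
```

On these figures the two criteria disagree: bisquare has the highest adjusted R2, while boxcar is the only kernel whose AICc falls below the global model's. The gains in adjusted R2 over the global model are small in every case (at most about 0.011).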

Note

We pick the best-performing configuration for each bandwidth type and compare them.

# Adjusted R2 of the best configuration for each bandwidth
bandwidth_comparison <- read_xlsx("data/rds/capital_local_bandwidth_comparison.xlsx")

ggplot(bandwidth_comparison, aes(x = Bandwidth, y = `Adjusted R2`)) +
  geom_col(fill = "steelblue", color = "black") +
  labs(title = "Comparison of Bandwidth Approaches",
       x = "Bandwidth",
       y = "Adjusted R2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))


Visualising Local R2

# Using the best-performing bandwidth selection: fixed bandwidth, boxcar kernel, AICc approach
bw.fixed_project <- bw.gwr(formula = total_project_count ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="aic", 
                           kernel="boxcar", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 AICc value: 1183.12 
Fixed bandwidth: 598861.3 AICc value: 1199.317 
Fixed bandwidth: 1197409 AICc value: 1164.742 
Fixed bandwidth: 1338707 AICc value: 1164.967 
Fixed bandwidth: 1110082 AICc value: 1176.805 
Fixed bandwidth: 1251380 AICc value: 1165.75 
Fixed bandwidth: 1164053 AICc value: 1163.872 
Fixed bandwidth: 1143438 AICc value: 1164.153 
Fixed bandwidth: 1176794 AICc value: 1165.42 
Fixed bandwidth: 1156179 AICc value: 1163.832 
Fixed bandwidth: 1151312 AICc value: 1163.795 
Fixed bandwidth: 1148305 AICc value: 1163.596 
Fixed bandwidth: 1146446 AICc value: 1164.296 
Fixed bandwidth: 1149453 AICc value: 1163.669 
Fixed bandwidth: 1147595 AICc value: 1163.596 
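
The search trace above is easier to read once it is sorted by bandwidth. The sketch below copies the fifteen printed evaluations into a data frame (`trace` is a name introduced here) and looks up the minimum AICc. The bandwidths are in the units of the data's coordinate system; these would be metres if `pci_2021` is in EPSG:3405, which is an assumption based on the later `st_transform()` call.

```r
# Bandwidth / AICc pairs copied from the bw.gwr() search trace
trace <- data.frame(
  bandwidth = c(968784.2, 598861.3, 1197409, 1338707, 1110082, 1251380,
                1164053, 1143438, 1176794, 1156179, 1151312, 1148305,
                1146446, 1149453, 1147595),
  aicc      = c(1183.12, 1199.317, 1164.742, 1164.967, 1176.805, 1165.75,
                1163.872, 1164.153, 1165.42, 1163.832, 1163.795, 1163.596,
                1164.296, 1163.669, 1163.596)
)

# Sort by bandwidth to see the shape of the AICc curve
trace <- trace[order(trace$bandwidth), ]
print(trace)

# Rows that attain the lowest AICc (ties are possible at the printed precision)
best <- trace[trace$aicc == min(trace$aicc), ]
print(best)
```

Two evaluations tie at the printed precision (1147595 and 1148305, both 1163.596), and the search ends on the former. The sorted trace is also not smooth: AICc rises and falls several times between 1143438 and 1338707, which may reflect the boxcar kernel's hard cut-off, where observations enter or leave a local fit abruptly as the bandwidth changes.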
gwr.fixed_project <- gwr.basic(formula = total_project_count ~ 
                                 `Entry Costs` +  `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_project, 
                               kernel = 'boxcar', 
                               longlat = FALSE)

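# Convert the GWR output (SDF) into an sf data frame in EPSG:3405
pci_2021_project <- st_as_sf(gwr.fixed_project$SDF) %>%
  st_transform(crs=3405)

# Bind the GWR output columns onto the sf object. Because pci_2021_project was
# itself built from the SDF, every column appears a second time with a ".1" suffix.
gwr.fixed.output_project <- as.data.frame(gwr.fixed_project$SDF)
pci_2021_project.fixed_project <- cbind(pci_2021_project, as.matrix(gwr.fixed.output_project))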

glimpse(pci_2021_project.fixed_project)
Rows: 66
Columns: 74
$ Intercept                       <dbl> -2029.95467, -932.37899, 894.71344, -6…
$ X.Entry.Costs.                  <dbl> -407.7129, -478.9765, -409.1094, -399.…
$ X.Land.Access.                  <dbl> 529.76677, -54.62115, -141.09552, 481.…
$ Transparency                    <dbl> -325.60467, -32.50034, -166.54523, -37…
$ X.Time.Costs.                   <dbl> 353.1721, 450.6535, 411.2222, 396.8354…
$ X.Informal.charges.             <dbl> -411.3981, -193.0164, -231.4254, -546.…
$ Proactivity                     <dbl> -153.73344, -12.23230, -14.30985, -161…
$ X.Business.Support.Policy.      <dbl> 732.5428, 412.1067, 339.5726, 707.5992…
$ X.Labor.Policy.                 <dbl> 828.7843, 664.5272, 677.9096, 852.7153…
$ X.Law...Order.                  <dbl> -713.1389, -467.7242, -469.8577, -724.…
$ y                               <dbl> 31, 595, 4, 15, 1820, 65, 99, 4073, 41…
$ yhat                            <dbl> -400.13304, 249.19696, 174.39859, 90.2…
$ residual                        <dbl> 431.13304, 345.80304, -170.39859, -75.…
$ CV_Score                        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Stud_residual                   <dbl> 0.37323197, 0.28012187, -0.17698405, -…
$ Intercept_SE                    <dbl> 5290.157, 4717.062, 5085.611, 7461.838…
$ X.Entry.Costs._SE               <dbl> 508.1250, 419.8693, 446.7241, 585.5165…
$ X.Land.Access._SE               <dbl> 525.4821, 520.0565, 559.0493, 673.0689…
$ Transparency_SE                 <dbl> 404.8617, 430.3879, 462.4965, 432.1743…
$ X.Time.Costs._SE                <dbl> 380.3176, 417.3923, 485.3297, 426.4734…
$ X.Informal.charges._SE          <dbl> 441.0214, 491.4422, 594.5691, 520.9128…
$ Proactivity_SE                  <dbl> 522.4891, 455.9661, 515.8280, 557.3436…
$ X.Business.Support.Policy._SE   <dbl> 294.3359, 305.0581, 344.2616, 366.8382…
$ X.Labor.Policy._SE              <dbl> 402.7943, 349.9538, 374.2111, 433.6041…
$ X.Law...Order._SE               <dbl> 515.3410, 542.8703, 603.1861, 599.5543…
$ Intercept_TV                    <dbl> -0.383722975, -0.197660957, 0.17593036…
$ X.Entry.Costs._TV               <dbl> -0.8023870, -1.1407754, -0.9157988, -0…
$ X.Land.Access._TV               <dbl> 1.00815386, -0.10502926, -0.25238478, …
$ Transparency_TV                 <dbl> -0.80423678, -0.07551406, -0.36010053,…
$ X.Time.Costs._TV                <dbl> 0.9286239, 1.0796883, 0.8473048, 0.930…
$ X.Informal.charges._TV          <dbl> -0.9328302, -0.3927550, -0.3892321, -1…
$ Proactivity_TV                  <dbl> -0.29423281, -0.02682720, -0.02774152,…
$ X.Business.Support.Policy._TV   <dbl> 2.4887983, 1.3509118, 0.9863797, 1.928…
$ X.Labor.Policy._TV              <dbl> 2.057587, 1.898900, 1.811570, 1.966576…
$ X.Law...Order._TV               <dbl> -1.3838194, -0.8615763, -0.7789598, -1…
$ Local_R2                        <dbl> 0.3754886, 0.4586589, 0.5009512, 0.391…
$ Intercept.1                     <named list> -2029.955, -932.379, 894.7134, …
$ X.Entry.Costs..1                <named list> -407.7129, -478.9765, -409.1094…
$ X.Land.Access..1                <named list> 529.7668, -54.62115, -141.0955,…
$ Transparency.1                  <named list> -325.6047, -32.50034, -166.5452…
$ X.Time.Costs..1                 <named list> 353.1721, 450.6535, 411.2222, 3…
$ X.Informal.charges..1           <named list> -411.3981, -193.0164, -231.4254…
$ Proactivity.1                   <named list> -153.7334, -12.2323, -14.30985,…
$ X.Business.Support.Policy..1    <named list> 732.5428, 412.1067, 339.5726, 7…
$ X.Labor.Policy..1               <named list> 828.7843, 664.5272, 677.9096, 8…
$ X.Law...Order..1                <named list> -713.1389, -467.7242, -469.8577…
$ y.1                             <named list> 31, 595, 4, 15, 1820, 65, 99, 4…
$ yhat.1                          <named list> -400.133, 249.197, 174.3986, 90…
$ residual.1                      <named list> 431.133, 345.803, -170.3986, -7…
$ CV_Score.1                      <named list> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ Stud_residual.1                 <named list> 0.373232, 0.2801219, -0.176984,…
$ Intercept_SE.1                  <named list> 5290.157, 4717.062, 5085.611, 7…
$ X.Entry.Costs._SE.1             <named list> 508.125, 419.8693, 446.7241, 58…
$ X.Land.Access._SE.1             <named list> 525.4821, 520.0565, 559.0493, 6…
$ Transparency_SE.1               <named list> 404.8617, 430.3879, 462.4965, 4…
$ X.Time.Costs._SE.1              <named list> 380.3176, 417.3923, 485.3297, 4…
$ X.Informal.charges._SE.1        <named list> 441.0214, 491.4422, 594.5691, 5…
$ Proactivity_SE.1                <named list> 522.4891, 455.9661, 515.828, 55…
$ X.Business.Support.Policy._SE.1 <named list> 294.3359, 305.0581, 344.2616, 3…
$ X.Labor.Policy._SE.1            <named list> 402.7943, 349.9538, 374.2111, 4…
$ X.Law...Order._SE.1             <named list> 515.341, 542.8703, 603.1861, 59…
$ Intercept_TV.1                  <named list> -0.383723, -0.197661, 0.1759304…
$ X.Entry.Costs._TV.1             <named list> -0.802387, -1.140775, -0.915798…
$ X.Land.Access._TV.1             <named list> 1.008154, -0.1050293, -0.252384…
$ Transparency_TV.1               <named list> -0.8042368, -0.07551406, -0.360…
$ X.Time.Costs._TV.1              <named list> 0.9286239, 1.079688, 0.8473048,…
$ X.Informal.charges._TV.1        <named list> -0.9328302, -0.392755, -0.38923…
$ Proactivity_TV.1                <named list> -0.2942328, -0.0268272, -0.0277…
$ X.Business.Support.Policy._TV.1 <named list> 2.488798, 1.350912, 0.9863797, …
$ X.Labor.Policy._TV.1            <named list> 2.057587, 1.8989, 1.81157, 1.96…
$ X.Law...Order._TV.1             <named list> -1.383819, -0.8615763, -0.77895…
$ Local_R2.1                      <named list> 0.3754886, 0.4586589, 0.5009512…
$ geometry.1                      <named list> [MULTIPOLYGON (((519993.2 12...…
$ geometry                        <MULTIPOLYGON [m]> MULTIPOLYGON (((519993.2 …
# Set tmap options to check and fix any invalid polygons
tmap_options(check.and.fix = TRUE)

tmap_mode("view")
tmap mode set to interactive viewing
str(pci_2021_project.fixed_project$Local_R2)
 num [1:66] 0.375 0.459 0.501 0.392 0.45 ...
pci_2021_project.fixed_project$Local_R2 <- unlist(pci_2021_project.fixed_project$Local_R2)


tm_shape(provincial_boundaries)+
  tm_polygons(alpha = 0.1) +
tm_shape(pci_2021_project.fixed_project) +  
  tm_polygons(col = "Local_R2",
          border.col = "gray60",
          border.lwd = 1) +
  tm_view(set.zoom.limits = c(5,8))
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape pci_2021_project.fixed_project is invalid (after
reprojection). See sf::st_is_valid
bw.fixed_capital <- bw.gwr(formula = total_registered_capital ~ 
                             `Entry Costs` + `Land Access` + Transparency + 
                             `Time Costs` + `Informal charges` + Proactivity + 
                             `Business Support Policy` + `Labor Policy` +
                             `Law & Order`,
                           data=pci_2021,
                           approach="CV", 
                           kernel="boxcar", 
                           adaptive=FALSE, 
                           longlat=FALSE)
Fixed bandwidth: 968784.2 CV score: 6660001230 
Fixed bandwidth: 598861.3 CV score: 8381045486 
Fixed bandwidth: 1197409 CV score: 5518004098 
Fixed bandwidth: 1338707 CV score: 5643658360 
Fixed bandwidth: 1110082 CV score: 5968840530 
Fixed bandwidth: 1251380 CV score: 5680684732 
Fixed bandwidth: 1164053 CV score: 5557485916 
Fixed bandwidth: 1218024 CV score: 5462660387 
Fixed bandwidth: 1230765 CV score: 5519707752 
Fixed bandwidth: 1210150 CV score: 5429675517 
Fixed bandwidth: 1205283 CV score: 5419111082 
Fixed bandwidth: 1202276 CV score: 5449814745 
Fixed bandwidth: 1207142 CV score: 5380297063 
Fixed bandwidth: 1208291 CV score: 5418884413 
Fixed bandwidth: 1206432 CV score: 5380297063 
gwr.fixed_capital <- gwr.basic(formula = total_registered_capital ~ 
                                 `Entry Costs` + `Land Access` + Transparency + 
                                 `Time Costs` + `Informal charges` + Proactivity + 
                                 `Business Support Policy` + `Labor Policy` +
                                 `Law & Order`,
                               data=pci_2021,
                               bw=bw.fixed_capital, 
                               kernel = 'boxcar', 
                               longlat = FALSE)

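# Convert the fixed-bandwidth SDF into an sf data frame in the projected CRS
pci_2021_capital <- st_as_sf(gwr.fixed_capital$SDF) %>%
  st_transform(crs=3405)


# Append the gwr.adaptive_capital output; cbind() gives its duplicated
# column names a ".1" suffix (e.g. Local_R2.1), as the glimpse() output shows
gwr.adaptive.output_capital <- as.data.frame(gwr.adaptive_capital$SDF)
pci_2021_capital.adaptive_capital <- cbind(pci_2021_capital, as.matrix(gwr.adaptive.output_capital))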

glimpse(pci_2021_capital.adaptive_capital)
Rows: 66
Columns: 74
$ Intercept                       <dbl> -40296.92, -39206.72, -34664.63, -2992…
$ X.Entry.Costs.                  <dbl> -6315.880, -3915.295, -1463.702, -7003…
$ X.Land.Access.                  <dbl> 3115.6525, 1336.2587, -596.5612, 3301.…
$ Transparency                    <dbl> -659.33327, 930.72608, 674.52694, -166…
$ X.Time.Costs.                   <dbl> 6286.779, 5627.989, 3601.156, 8412.177…
$ X.Informal.charges.             <dbl> -2229.732, -4638.050, -3234.338, -3852…
$ Proactivity                     <dbl> -1746.3700, -2379.3987, 471.3656, -357…
$ X.Business.Support.Policy.      <dbl> 6438.578, 5805.181, 2519.876, 5362.128…
$ X.Labor.Policy.                 <dbl> 6851.822, 7252.902, 5818.277, 6408.295…
$ X.Law...Order.                  <dbl> -4213.7233, -2292.8904, -1192.7572, -2…
$ y                               <dbl> 317.31, 9408.00, 7.90, 4496.04, 23317.…
$ yhat                            <dbl> 4256.9962, 4519.5964, 1586.6253, -667.…
$ residual                        <dbl> -3939.6862, 4888.4036, -1578.7253, 516…
$ CV_Score                        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Stud_residual                   <dbl> -0.53888519, 0.63967249, -0.25299867, …
$ Intercept_SE                    <dbl> 29736.23, 27186.35, 28928.14, 39156.01…
$ X.Entry.Costs._SE               <dbl> 2877.859, 2366.848, 2616.283, 3203.628…
$ X.Land.Access._SE               <dbl> 3058.535, 3106.294, 3204.270, 3867.818…
$ Transparency_SE                 <dbl> 2240.409, 2448.362, 2731.885, 2533.762…
$ X.Time.Costs._SE                <dbl> 2171.512, 2507.000, 2704.613, 2549.890…
$ X.Informal.charges._SE          <dbl> 2526.097, 2825.835, 3289.331, 2983.222…
$ Proactivity_SE                  <dbl> 2656.150, 2691.953, 2894.385, 3245.829…
$ X.Business.Support.Policy._SE   <dbl> 1768.310, 1660.972, 1921.297, 2001.774…
$ X.Labor.Policy._SE              <dbl> 1925.310, 1990.376, 2152.967, 2517.497…
$ X.Law...Order._SE               <dbl> 2882.135, 3208.448, 3324.414, 3299.518…
$ Intercept_TV                    <dbl> -1.3551455, -1.4421471, -1.1983015, -0…
$ X.Entry.Costs._TV               <dbl> -2.1946452, -1.6542232, -0.5594587, -2…
$ X.Land.Access._TV               <dbl> 1.0186748, 0.4301779, -0.1861769, 0.85…
$ Transparency_TV                 <dbl> -0.29429144, 0.38014231, 0.24690897, -…
$ X.Time.Costs._TV                <dbl> 2.895116, 2.244910, 1.331486, 3.299035…
$ X.Informal.charges._TV          <dbl> -0.8826785, -1.6413025, -0.9832814, -1…
$ Proactivity_TV                  <dbl> -0.65748173, -0.88389322, 0.16285519, …
$ X.Business.Support.Policy._TV   <dbl> 3.641091, 3.495050, 1.311549, 2.678687…
$ X.Labor.Policy._TV              <dbl> 3.558815, 3.643986, 2.702445, 2.545502…
$ X.Law...Order._TV               <dbl> -1.4620144, -0.7146416, -0.3587872, -0…
$ Local_R2                        <dbl> 0.6188694, 0.6077208, 0.3747035, 0.587…
$ Intercept.1                     <named list> -45108.2, -32649.02, -32649.02,…
$ X.Entry.Costs..1                <named list> -4766.074, -3639.628, -3639.628…
$ X.Land.Access..1                <named list> 3138.823, 2466.781, 2466.781, 3…
$ Transparency.1                  <named list> 497.1627, 300.1028, 300.1028, 4…
$ X.Time.Costs..1                 <named list> 5507.525, 5366.96, 5366.96, 550…
$ X.Informal.charges..1           <named list> -2938.525, -3911.892, -3911.892…
$ Proactivity.1                   <named list> -2582.493, -2272.937, -2272.937…
$ X.Business.Support.Policy..1    <named list> 5991.509, 5118.093, 5118.093, 5…
$ X.Labor.Policy..1               <named list> 7293.703, 6604.062, 6604.062, 7…
$ X.Law...Order..1                <named list> -3605.304, -3388.805, -3388.805…
$ y.1                             <named list> 317.31, 9408, 7.9, 4496.04, 233…
$ yhat.1                          <named list> 4460.401, 5480.021, 6638.747, 7…
$ residual.1                      <named list> -4143.091, 3927.979, -6630.847,…
$ CV_Score.1                      <named list> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ Stud_residual.1                 <named list> -0.5374782, 0.5033235, -0.94289…
$ Intercept_SE.1                  <named list> 27299.05, 26139.39, 26139.39, 2…
$ X.Entry.Costs._SE.1             <named list> 2220.128, 2192.642, 2192.642, 2…
$ X.Land.Access._SE.1             <named list> 2812.749, 2722.712, 2722.712, 2…
$ Transparency_SE.1               <named list> 2002.072, 1967.115, 1967.115, 2…
$ X.Time.Costs._SE.1              <named list> 2132.695, 2088.533, 2088.533, 2…
$ X.Informal.charges._SE.1        <named list> 2415.742, 2330.71, 2330.71, 241…
$ Proactivity_SE.1                <named list> 2368.776, 2374.589, 2374.589, 2…
$ X.Business.Support.Policy._SE.1 <named list> 1498.989, 1496.089, 1496.089, 1…
$ X.Labor.Policy._SE.1            <named list> 1749.472, 1698.588, 1698.588, 1…
$ X.Law...Order._SE.1             <named list> 2719.473, 2679.096, 2679.096, 2…
$ Intercept_TV.1                  <named list> -1.652372, -1.249035, -1.249035…
$ X.Entry.Costs._TV.1             <named list> -2.146756, -1.659929, -1.659929…
$ X.Land.Access._TV.1             <named list> 1.115927, 0.9060013, 0.9060013,…
$ Transparency_TV.1               <named list> 0.2483241, 0.1525599, 0.1525599…
$ X.Time.Costs._TV.1              <named list> 2.582425, 2.569728, 2.569728, 2…
$ X.Informal.charges._TV.1        <named list> -1.216407, -1.678412, -1.678412…
$ Proactivity_TV.1                <named list> -1.090223, -0.9571917, -0.95719…
$ X.Business.Support.Policy._TV.1 <named list> 3.997034, 3.420982, 3.420982, 3…
$ X.Labor.Policy._TV.1            <named list> 4.169087, 3.887972, 3.887972, 4…
$ X.Law...Order._TV.1             <named list> -1.325736, -1.264906, -1.264906…
$ Local_R2.1                      <named list> 0.5983664, 0.5983664, 0.5983664…
$ geometry.1                      <named list> [MULTIPOLYGON (((519993.2 12...…
$ geometry                        <MULTIPOLYGON [m]> MULTIPOLYGON (((519993.2 …
# Set tmap options to check and fix any invalid polygons
tmap_options(check.and.fix = TRUE)

tmap_mode("view")
tmap mode set to interactive viewing
str(pci_2021_capital.adaptive_capital$Local_R2)
 num [1:66] 0.619 0.608 0.375 0.587 0.61 ...
pci_2021_capital.adaptive_capital$Local_R2 <- unlist(pci_2021_capital.adaptive_capital$Local_R2)


tm_shape(provincial_boundaries)+
  tm_polygons(alpha = 0.1) +
tm_shape(pci_2021_capital.adaptive_capital) +  
  tm_polygons(col = "Local_R2",
          border.col = "gray60",
          border.lwd = 1) +
  tm_view(set.zoom.limits = c(5,8))
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape pci_2021_capital.adaptive_capital is invalid (after
reprojection). See sf::st_is_valid


Visualising coefficient estimates

tmap_mode("view")
tmap mode set to interactive viewing
# Map of the local standard errors of the Transparency coefficient
transparency_se_map <- tm_shape(provincial_boundaries)+
  tm_polygons(alpha = 0.1) +
tm_shape(pci_2021_project.fixed_project) +  
  tm_polygons(col = "Transparency_SE",
          border.col = "gray60",
          border.lwd = 1) +
  tm_view(set.zoom.limits = c(5,8))

# Map of the local t-values of the Transparency coefficient
transparency_tv_map <- tm_shape(provincial_boundaries)+
  tm_polygons(alpha = 0.1) +
tm_shape(pci_2021_project.fixed_project) +  
  tm_polygons(col = "Transparency_TV",
          border.col = "gray60",
          border.lwd = 1) +
  tm_view(set.zoom.limits = c(5,8))

# Show both maps side by side with synchronised pan and zoom
tmap_arrange(transparency_se_map, transparency_tv_map, 
             asp=1, ncol=2,
             sync = TRUE)
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape pci_2021_project.fixed_project is invalid (after
reprojection). See sf::st_is_valid
Warning: The shape provincial_boundaries is invalid (after reprojection). See
sf::st_is_valid
Warning: The shape pci_2021_project.fixed_project is invalid (after
reprojection). See sf::st_is_valid
Variable(s) "Transparency_TV" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.



6.0 Shiny Storyboard

6.1 Global Explanatory Model

6.1.1 Multiple Linear Regression (MLR)

Multiple Linear Regression (MLR) provides a baseline model to predict an outcome using multiple predictor variables. It assumes a consistent, global relationship across all data points, offering a broad understanding of how these variables influence the dependent variable overall.

Users have the flexibility to select specific combinations of predictor variables (e.g., PCI factors) to explore different modeling outcomes and compare how these combinations affect results.

Interpretation: A higher Adjusted R² indicates that the model explains more of the variation in the outcome after accounting for the number of predictors. The p-value reflects statistical significance: lower values indicate stronger evidence that the predictors are associated with the outcome.
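As a rough illustration of the quantities described above, the following sketch fits a multiple linear regression on simulated data and extracts the Adjusted R² and the overall p-value. The data frame `toy` and its column names are hypothetical stand-ins, not the PCI dataset used in this exercise.

```r
# Simulated stand-in data; the column names are hypothetical
set.seed(123)
n <- 66
toy <- data.frame(
  labor_policy     = rnorm(n, mean = 6, sd = 1),
  business_support = rnorm(n, mean = 6, sd = 1),
  transparency     = rnorm(n, mean = 6, sd = 1)
)
# Only two of the three predictors actually drive the simulated outcome
toy$fdi_projects <- 50 + 30 * toy$labor_policy +
  20 * toy$business_support + rnorm(n, sd = 25)

# Global model: one set of coefficients for all observations
fit <- lm(fdi_projects ~ labor_policy + business_support + transparency,
          data = toy)
fit_summary <- summary(fit)

# Adjusted R-squared of the fitted model
adj_r2 <- fit_summary$adj.r.squared

# p-value of the overall F-test, computed from the stored F statistic
f <- fit_summary$fstatistic
model_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# Per-predictor estimates, standard errors, t-values and p-values
coef_table <- fit_summary$coefficients

print(adj_r2)
print(model_p)
print(coef_table)
```

Refitting with a different predictor combination and comparing `adj_r2` mirrors, in a simplified way, the comparison the storyboard lets users make.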


6.1.2 Stepwise Model Selection

Stepwise Model Selection allows users to refine the Multiple Linear Regression model by adding or removing predictor variables in a systematic way.

Users can select from 3 approaches:

  • forward selection (starting with no predictors and adding them),

  • backward elimination (starting with all predictors and removing them),

  • or both (a combination of adding and removing).

Users also have control over the confidence level (e.g., 0.95 or 0.99), adjusting the stringency for including predictors based on statistical significance.

Interpretation and Visualization: A radar chart provides a visual comparison of different models, allowing users to select and view how models perform across chosen predictors and confidence levels. This helps in identifying the most effective model based on the desired balance of predictors and statistical robustness.
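The three search directions can be sketched with base R's `step()` function. Note the assumption: `step()` chooses predictors by AIC rather than by a p-value threshold, so it does not reproduce the confidence-level control described above. It is shown only to illustrate the forward, backward, and both strategies on simulated, hypothetical data.

```r
# Simulated data: only x1 and x2 truly affect y (hypothetical example)
set.seed(2024)
n <- 66
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
toy$y <- 2 + 3 * toy$x1 - 2 * toy$x2 + rnorm(n)

null_model <- lm(y ~ 1, data = toy)                   # no predictors
full_model <- lm(y ~ x1 + x2 + x3 + x4, data = toy)   # all predictors
full_scope <- formula(full_model)                     # upper limit of the search

# Forward: start empty and add predictors while AIC improves
forward <- step(null_model, scope = full_scope,
                direction = "forward", trace = 0)

# Backward: start full and drop predictors while AIC improves
backward <- step(full_model, direction = "backward", trace = 0)

# Both: predictors may be added or dropped at each step
both <- step(null_model, scope = full_scope,
             direction = "both", trace = 0)

# Compare the selected models by Adjusted R-squared
models <- list(forward = forward, backward = backward, both = both)
print(sapply(models, function(m) summary(m)$adj.r.squared))
print(lapply(models, function(m) names(coef(m))))
```

The per-model summaries collected at the end are the kind of values a comparison chart, such as the radar chart mentioned above, could display.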


6.1.3 Visualise Model Parameters

Visualize Model Parameters offers users an interactive way to examine and compare the effects of predictor variables across different stepwise models. Users can select their preferred model from the stepwise selection results and sort parameter values in ascending or descending order for easier comparison.

Interpretation and Customization: This visualization provides a clear view of each predictor’s influence on the outcome variable within the selected model, helping users assess the relative importance of predictors. By sorting the parameters, users can quickly identify the most impactful variables or spot subtle differences across models, aiding in deeper analysis and model refinement.
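A minimal sketch of the sorting step, assuming a fitted `lm` object and simulated, hypothetical predictors `x1` to `x3`: the coefficients are extracted, the intercept is dropped, and the remaining estimates are ordered before plotting.

```r
# Simulated data with effects of different size and sign (hypothetical)
set.seed(7)
n <- 66
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 1 + 4 * toy$x1 - 2.5 * toy$x2 + 0.5 * toy$x3 + rnorm(n)

fit <- lm(y ~ x1 + x2 + x3, data = toy)

# Drop the intercept so that only predictor effects are compared
estimates <- coef(fit)[-1]

# Sort in descending and ascending order
sorted_desc <- sort(estimates, decreasing = TRUE)
sorted_asc  <- sort(estimates)

print(sorted_desc)

# Horizontal bar chart of the sorted estimates
barplot(sorted_desc, horiz = TRUE, las = 1,
        xlab = "Coefficient estimate",
        main = "Model parameters (sorted)")
```

Raw coefficients are only directly comparable when the predictors share a similar scale, so standardizing the predictors first may be a sensible design choice before ranking them.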


6.2 Local Explanatory Model

6.2.1 Bandwidth Selection

Bandwidth Selection in Geographically Weighted Regression (GWR) allows users to fine-tune the model’s spatial sensitivity by choosing between fixed and adaptive bandwidth options.

  • Side-by-Side Comparison:
    Users can compare the effects of fixed vs. adaptive bandwidths side-by-side. This comparison highlights how each bandwidth type influences the spatial scale of the analysis:

    • Fixed bandwidth applies a constant spatial radius for all data points, which is ideal for evenly spaced data.

    • Adaptive bandwidth adjusts to data density: it is defined by a number of nearest neighbours, so the kernel widens in sparse areas and narrows in dense ones, which keeps local estimates stable where the data distribution varies.

  • Selection Options for Each Bandwidth Type:

    • Approach: Users can choose between cross-validation (CV) and corrected AIC (AICc) to optimize the bandwidth.

    • Kernel Method: Users can select the kernel type (e.g., Gaussian, bisquare, or tricube) to define the shape and weighting of spatial influence around each data point, tailoring the model to the spatial structure of the data.

Purpose:
This flexible bandwidth selection process allows users to determine the optimal balance of local vs. global influence, enhancing model accuracy and providing insights into spatial patterns at different scales.
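To make the difference between fixed and adaptive bandwidths concrete, the base R sketch below computes kernel weights by hand on a simulated point pattern. The coordinates, the fixed distance of 15, and the neighbour count `k = 10` are all hypothetical. In the analysis itself, `bw.gwr()` searches for the bandwidth, so this is only an illustration of what the bandwidth and kernel control.

```r
# Kernel weight functions: d = distance to the regression point, bw = bandwidth
gaussian_w <- function(d, bw) exp(-0.5 * (d / bw)^2)
bisquare_w <- function(d, bw) ifelse(d < bw, (1 - (d / bw)^2)^2, 0)
boxcar_w   <- function(d, bw) ifelse(d < bw, 1, 0)

# Hypothetical point pattern: a dense cluster plus a few scattered points
set.seed(1)
coords <- rbind(
  cbind(rnorm(40, 10, 1), rnorm(40, 10, 1)),
  cbind(runif(10, 0, 100), runif(10, 0, 100))
)
dist_mat <- as.matrix(dist(coords))

# Fixed bandwidth: the same distance for every regression point
fixed_bw <- 15
n_fixed <- rowSums(boxcar_w(dist_mat, fixed_bw) > 0) - 1  # exclude the point itself

# Adaptive bandwidth: distance to the k-th nearest neighbour of each point
# (position k + 1 after sorting, because the point itself sits at distance 0)
k <- 10
adaptive_bw <- apply(dist_mat, 1, function(d) sort(d)[k + 1])
n_adaptive <- rowSums(dist_mat <= adaptive_bw) - 1        # exclude the point itself

# Neighbour counts vary under the fixed bandwidth, distances vary under the adaptive one
print(range(n_fixed))
print(range(n_adaptive))
print(range(adaptive_bw))

# Example: bisquare weights around the first point under its adaptive bandwidth
w_first <- bisquare_w(dist_mat[1, ], adaptive_bw[1])
```

With a fixed distance, isolated points end up with very few neighbours. With an adaptive bandwidth, every point draws on the same number of neighbours but over very different distances.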


6.2.2 Visualise Local R²

Visualizing Local R² provides an in-depth look at how well the Geographically Weighted Regression (GWR) model explains variation in the outcome across different areas. This visualization highlights spatial differences in model performance, making it easier to identify regions where predictors more effectively capture local patterns.

  • Customization Options:

    • Bandwidth Type: Choose between fixed or adaptive bandwidth to control the scale of spatial influence.

    • Bandwidth Optimization Approach: Select cross-validation (CV) or corrected AIC (AICc) to determine the optimal bandwidth. CV targets out-of-sample predictive accuracy, while AICc trades goodness of fit against model complexity.

    • Kernel Method: Choose the kernel type (e.g., Gaussian, bisquare, or tricube) to set the shape and weighting of spatial influence, refining how local R² values are calculated across locations.


This flexible visualization helps users assess where the model has strong or weak explanatory power across the study area, offering insights into local model fit. By adjusting bandwidth, approach, and kernel, users can explore how these choices impact model performance, identifying areas with robust predictions and areas needing further investigation.
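The idea behind a local R² can be sketched in base R by fitting one distance-weighted regression per location. This is a simplified stand-in under stated assumptions: the data are simulated, the Gaussian kernel and the bandwidth of 30 are arbitrary choices, and the weighted R² reported by `summary()` is not necessarily the exact formula that `gwr.basic()` uses for its `Local_R2` column.

```r
# Simulated data: the effect of x on y strengthens from west to east (hypothetical)
set.seed(99)
n <- 60
toy <- data.frame(east = runif(n, 0, 100),
                  north = runif(n, 0, 100),
                  x = rnorm(n))
toy$y <- 1 + (toy$east / 25) * toy$x + rnorm(n, sd = 0.5)

# Pairwise distances between all locations
dist_mat <- as.matrix(dist(toy[, c("east", "north")]))

bw <- 30                      # arbitrary fixed bandwidth for illustration
local_r2 <- numeric(n)

for (i in seq_len(n)) {
  # Gaussian kernel weights centred on location i
  toy$w <- exp(-0.5 * (dist_mat[i, ] / bw)^2)
  # Weighted least squares fit that emphasises nearby observations
  local_fit <- lm(y ~ x, data = toy, weights = w)
  local_r2[i] <- summary(local_fit)$r.squared
}

# One R-squared value per location, ready to be joined to geometries and mapped
print(summary(local_r2))
```

Because the simulated relationship is weak in the west and strong in the east, the local values should tend to rise eastwards, which is the kind of spatial pattern a Local R² map is meant to reveal.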